Abstract

Internet advertising is a sophisticated game in which the many advertisers "play" to optimize 
their return on investment. There are many "targets" for the advertisements, and each 
"target" is a collection of games in which a potentially different set of players is involved. In this 
paper, we study the problem of how advertisers allocate their budget across these "targets". 
In particular, we focus on formulating their best response strategy as an optimization problem. 
Advertisers have a set of keywords ("targets") and some stochastic information about the future, 
namely a probability distribution over scenarios of cost vs click combinations. This summarizes 
the potential states of the world assuming that the strategies of other players are fixed. Then, 
the best response can be abstracted as stochastic budget optimization problems to figure out 
how to spread a given budget across these keywords to maximize the expected number of clicks. 

We present the first known non-trivial poly-logarithmic approximation for these problems as 
well as the first known hardness results of getting better than logarithmic approximation ratios 
in the various parameters involved. We also identify several special cases of these problems of 
practical interest, such as with fixed number of scenarios or with polynomial-sized parameters 
related to cost, which are solvable either in polynomial time or with improved approximation 
ratios. Stochastic budget optimization with scenarios has sophisticated technical structure. Our 
approximation and hardness results come from relating these problems to a special type of (0/1, 
bipartite) quadratic programs inherent in them. Our research answers some open problems 
raised by the authors in (Stochastic Models for Budget Optimization in Search-Based Advertising, 
Algorithmica, 58 (4), 1022-1044, 2010). 

1 Introduction 



This paper deals with the problem of how advertisers allocate their budget in Internet advertising. 
In sponsored search, users who pose queries to internet search engines are not only provided search 
results, but also a small set of text ads. These ads are chosen from a set of campaigns set up by 
advertisers based on the keywords in the search query. A lot of focus has been on how these ads 
are chosen and priced, which is via an auction that is by now well known [2,9,2s] 1 . Our focus is 
instead on the problem faced by advertisers. Even small advertisers have many keywords, a budget 
in mind and must figure out how to spread this budget on bids for each of these keywords. This 
is a highly nontrivial task, and the basis for a separate industry to support advertisers. A similar 
problem arises with "display ads" where advertisers have websites where their ads will be shown 

1 Likewise, there was a lot of work on bidding strategies [3, 12, 21, 26]. This paper extends that body of work by 
considering a richer model of uncertainty; see subsequent paragraphs. 
and need to split their budget for the ad campaign across the sites to be most effective. Likewise, in 
behavioral targeting, advertisers have to decide how to spread their budget across behavior groups. 
In all these cases, therefore, advertisers have various "targets" and wish to split their budget across 
them to optimize their ad campaigns. 

Consider the sponsored search example and fix an advertiser A. They have many keywords that 
they would like to target for their ads. How should they bid for each, given some overall budget 
they can spend? There is a sophisticated underlying game in which the many advertisers "play" 
to optimize their return on investment simultaneously. For each keyword and for each auction 
instance triggered by this keyword, there is potentially a different set of competing advertisers 
involved. Building effective strategies is challenging amidst so many parameters. A fundamental 
and widely accepted proposal is for the advertiser A to pursue a best response strategy, i.e., fix 
the strategies of other advertisers and pick the best strategy as one's response. Besides being a 
simple and easy strategy to understand and hence suitable for experimentation by advertisers, best 
response has desirable properties. For example, in the absence of budgets and for a single repeated 
auction, a special type of best response by every player leads to the VCG outcome [4, 5, 9, 23]. 

In order to help the advertisers implement this best response strategy, search engines provide 
them with an expected bid-versus-clicks function for each keyword 2 . Assuming that the rest of the 
world is fixed, these functions provide an estimate of the expected number of clicks an advertiser 
would obtain by bidding a certain value on that keyword. These functions can also be "learned" by 
an advertiser to some extent by systematically trying out various bids. Finding an advertiser's best 
response bidding strategy then becomes an optimization problem where the goal is to maximize 
the expected number of clicks assuming access to these functions. The resulting problems are in 
the spirit of the Knapsack problem [3, 12, 21, 26], with many of them solvable nearly exactly or with 
constant factor approximations 3 . 

A more general approach is to acknowledge that, in reality, the bids vs clicks functions are not 
fixed, but rather random variables with unknown correlations and uncertainties: number of queries 
(and hence, clicks and budget spent on a keyword) change each day, relative occurrences of keywords 
change (e.g., searches for beach and snow are complementary 4 ), and so on. Therefore, one has to 
consider a specific stochastic model for these random variables and then maximize the expected 
number of clicks under that model. This approach was initiated in [21] leading to a stochastic 
budget optimization problem that is studied in this paper. 

1.1 Organization of the paper 

For convenience of the readers, we organize the rest of the paper in the following manner. 

• We start with Section 2 which describes all of our stochastic budget optimization models and 
corresponding computational problems precisely, starting from the simplest one, together 
with some comments and justifications about the model. In the last subsection of this section 
(Section 2.5), we fix uniform notation for the reader's convenience. 

• In Section 3, we summarize the results obtained in this paper. For the benefit of the reader, 
we group the results into two categories, namely a set of main results that deal with the 
computational complexity issues of the original models without restrictions and a set of additional 
results that deal with variations and special cases of the models defined in Section 2. 

2 See, for example, Traffic Estimator at http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=8692, 
bidding tutorial at http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=163828 and bid 
simulator at http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=138148 

3 The scenario model was introduced in [21]. For a very detailed discussion of prior works related 
to the approach in the model, see Section 1.4 of [21]. 

4 See www.google.com/trends?q=beach%2C+snow&ctab=0&geo=all&date=all&sort=0 for yearly and 
www.google.com/trends?q=clubs%2C+stocks&ctab=0&geo=all&date=mtd&sort=0 for weekly trends. 

The remaining sections of the paper, excluding conclusion and references, deal with precise state- 
ments of our results and technical details of their proofs. For complex proofs, we first provide a 
more informal overview of the steps in the proof before proceeding with technical details. These 
sections are organized in the following manner. 

• In Section 4 we discuss the quadratic integer programming reformulations of the various Sbo 
problems. 

• In Section 5 we state and prove our poly-logarithmic approximation algorithms for Ssbo and 
Multi-Ssbo problems (main result (R1)). 

• In Section 6, we state and prove our approximation-hardness results for both Ssbo and 
Multi-Ssbo problems (main result (R2)). 

• Section 7 contains all other results: 

- In Section 7.1 we show that many Ssbo problems have improved solutions if certain 
parameters are restricted in their range of values. 

- In Section 7.2 we show the limitations of semidefinite programming based approaches 
for solving Ssbo problems. 



2 Scenario Model for Stochastic Budget Optimization 

We discuss the model and related problems using the language of sponsored search 5 . We use the 
suffix Ssbo (Scenario Stochastic Budget Optimization) for various acronyms for different versions of 
our problems. For the convenience of the readers and to delay introducing more involved notations, 
we first start with a slightly simpler version of the model involving only one slot. We refer to this 
version as the "uniform cost" case and describe it in the next section. 

2.1 Single Slot Case: Uniform Cost Model 

This basic model starts with the following assumptions: 

• There is a single slot for advertising. 

• We have a set of $n$ keywords $\mathcal{K}_1, \mathcal{K}_2, \ldots, \mathcal{K}_n$ with the keyword $\mathcal{K}_j$ having a cost-per-click $d_j$ 
(a positive integer). 

• We have a positive integer B denoting the budget for the advertiser. 

• We have a collection of m "scenarios" where the ith scenario is characterized by the following 
parameters: 

- A probability $\varepsilon_i$ ($\sum_{i=1}^{m} \varepsilon_i = 1$). 

- A "click vector" $(a_{i,1}, a_{i,2}, \ldots, a_{i,n})$ where each $a_{i,j} \ge 0$ is an integer. Each $a_{i,j}$ denotes 
the number of clicks obtained by the $j$th keyword $\mathcal{K}_j$ in the $i$th scenario. 

5 Our discussion can easily be adapted to other internet ad channels like display ads and behavioral targeting. 
Scenarios can be thought of as sampling the model over various times 6 . 

Our general goal is to compute $n$ selection variables $x_1, x_2, \ldots, x_n$, where $x_j$ corresponds to the 
$j$th keyword, to maximize a suitable total payoff. A crucial aspect of the discussed formulation is 
that, if the budget is not limiting, then the payoff corresponds to the total number of expected 
clicks, but if the budget turns out to be limiting for any scenario then the payoff scales the total 
number of expected clicks by the fraction that the budget would provide 7 . Based on the above 
intuition, our precise goal is to maximize the total expected payoff over all scenarios, i.e., 

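$$\text{maximize}\quad E[\text{payoff}] = \sum_{i=1}^{m} E[\text{payoff}_i]$$

where the expected payoff $E[\text{payoff}_i]$ for the $i$th scenario is 

$$E[\text{payoff}_i] = \begin{cases} \varepsilon_i \sum_{j=1}^{n} a_{i,j} x_j, & \text{if } \sum_{j=1}^{n} a_{i,j} d_j x_j \le B \\[4pt] \dfrac{B}{\sum_{j=1}^{n} a_{i,j} d_j x_j} \left( \varepsilon_i \sum_{j=1}^{n} a_{i,j} x_j \right), & \text{otherwise} \end{cases} \tag{1}$$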



Following [21], we distinguish between two versions of the problem based on the nature of the 
selection variables: 

Integral version (Uniform-Int-Ssbo): $x_j \in \{0, 1\}$ for all $j$. This corresponds to the case when 
based on the stochastic information, either the advertiser chooses to win and pay for all clicks 
for a keyword, or not at all. Hence, the strategy of the advertiser is deterministic. 

Fractional version (Uniform-Frac-Ssbo): $0 \le x_j \le 1$ for all $j$. This can be thought of as a 
strategy in which the advertiser treats these numbers as probabilities and bids for the keywords 
in a randomized fashion based on these probabilities, thereby only winning (and paying for) 
a portion of all clicks and impressions for each keyword. If the deterministic strategy is hard 
to compute and provides a solution of bad quality then the randomized strategy is more 
desirable. 
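
To make the payoff definition concrete, the following is a minimal Python sketch of Equation (1) for the uniform cost model. The function name `expected_payoff` and the toy numbers are illustrative assumptions and do not come from [21] or from this paper; exact rationals are used only to keep the arithmetic transparent.

```python
from fractions import Fraction

def expected_payoff(eps, clicks, cost, budget, x):
    """Expected payoff of selection vector x, following Equation (1).

    eps[i]       probability of scenario i
    clicks[i][j] clicks a_{i,j} of keyword j in scenario i
    cost[j]      cost-per-click d_j of keyword j
    x[j]         selection variable (0/1, or a fraction in [0, 1])
    """
    total = Fraction(0)
    for eps_i, a_i in zip(eps, clicks):
        got = sum(Fraction(a) * xj for a, xj in zip(a_i, x))
        spend = sum(Fraction(a) * d * xj for a, d, xj in zip(a_i, cost, x))
        if spend <= budget:
            total += eps_i * got
        else:
            # Over budget: scale the clicks by the affordable fraction B / spend.
            total += eps_i * got * Fraction(budget) / spend
    return total

# Toy instance (made-up numbers): two keywords, two equally likely scenarios.
eps = [Fraction(1, 2), Fraction(1, 2)]
clicks = [[10, 5], [4, 2]]
cost = [1, 2]
budget = 10

both = expected_payoff(eps, clicks, cost, budget, [1, 1])
first_only = expected_payoff(eps, clicks, cost, budget, [1, 0])
print(both, first_only)  # prints: 27/4 7
```

In this toy instance, selecting both keywords overshoots the budget in the first scenario and yields a lower expected payoff than selecting only the cheaper keyword, which illustrates why the budget acts as a "soft" constraint.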

Other than the scenario model, there are at least two other possible models for stochastic budget 
optimization as discussed in [21]. In the proportional model there is just one global random variable 
for the total number of clicks in the day that keeps the relative proportions of clicks for different 
keywords the same, whereas in the independent keywords model each keyword comes with its own 
probability distribution. However, among all these models this scenario-based model is perhaps one 
of the most natural models of reality and provides an appropriate middle ground between a complex 
arbitrary joint probability distribution and a single distribution for all keywords. It was shown 
in [21] that both Uniform-Int-Ssbo and Uniform-Frac-Ssbo are NP-hard. In the sequel, we 
assume without loss of generality that $1 = d_1 \le d_2 \le \cdots \le d_n$. 



6 Scenarios can be provided by the search engine for the advertisers, or used by the search engines to bid on behalf 
of advertisers. Similarly, advertisers and other search engine optimizers can also "infer" scenarios indirectly using 
trends and other data provided by search engines. 

7 The underlying assumption is that, within a scenario, the queries and keywords are well-mixed and, when budget 
runs out, the ad campaign is halted for the period as is currently done. The queries and keywords are well-mixed not 
only because of aggregation of streams from millions of users but also because of ad throttling that spreads out the 
eligible ad campaigns over the period of a scenario. See [21] for exact details of justification. 
2.2 Single Slot Case: General Model 



In a more realistic version of the Ssbo problems the cost-per-click values may vary slightly over a 
range of scenarios due to small errors in their estimation. This can be modeled by introducing a 
stretch parameter (small integer) 8 $1 \le \kappa = O(\mathrm{poly}(\log(m + n)))$. Now, $d_j$ stands for the basic cost- 
per-click for the keyword $\mathcal{K}_j$, whereas the real cost-per-click for the keyword $\mathcal{K}_j$ in the $i$th scenario 
is denoted by $c_{i,j}$, with $c_{i,j} \in [d_j, \kappa d_j]$ 9 . Then, Equation (1) can be simply updated by replacing $d_j$ 
in the equation of the $i$th scenario by $c_{i,j}$. We refer to the integral and fractional versions of this 
general case as Int-Ssbo and Frac-Ssbo, respectively; note that the Uniform-Ssbo problems 
are obtained from the corresponding Ssbo problems by setting $\kappa = 1$. 



2.3 Multi Slot Model 

In the multi-slot case there are $s > 1$ slots for each keyword with a generalized second price (GSP) auction for 
these slots. Let $d_{j,k}$ be an integer denoting the value of the basic cost-per-click of the $k$th slot of the $j$th 
keyword; we assume without loss of generality that $d_{j,1} \le d_{j,2} \le \cdots \le d_{j,s}$. Let $c_{i,j,k} \in [d_{j,k}, \kappa d_{j,k}]$ 
denote the value of the real cost-per-click for the $k$th slot of the $j$th keyword in the $i$th scenario 
where $\kappa$ is the stretch parameter as in Section 2.2, and let $B > 0$ denote the budget (a positive 
integer) for the advertiser. Our goal is now to compute a set of $sn$ selection variables $x_{j,k}$ where 
the selection variable $x_{j,k}$ corresponds to the $k$th slot for the $j$th keyword. We again have a collection 
of $m$ scenarios where the $i$th scenario is characterized via: 

• a probability $\varepsilon_i$ ($\sum_{i=1}^{m} \varepsilon_i = 1$); and 

• a "click vector" $(a_{i,j,1}, a_{i,j,2}, \ldots, a_{i,j,s})$ for the $j$th keyword, where each $a_{i,j,k}$ is a non-negative integer. 

The goal is to compute the allocation variables $x_{j,k}$ with the constraints 



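$$\forall j:\ \sum_{k=1}^{s} x_{j,k} \le 1 \tag{2}$$

to maximize the total expected payoff 

$$E[\text{payoff}] = \sum_{i=1}^{m} E[\text{payoff}_i]$$

where 

$$E[\text{payoff}_i] = \begin{cases} \varepsilon_i \sum_{j} \sum_{k} a_{i,j,k} x_{j,k}, & \text{if } \sum_{j} \sum_{k} a_{i,j,k} c_{i,j,k} x_{j,k} \le B \\[4pt] \dfrac{B}{\sum_{j} \sum_{k} a_{i,j,k} c_{i,j,k} x_{j,k}} \left( \varepsilon_i \sum_{j} \sum_{k} a_{i,j,k} x_{j,k} \right), & \text{otherwise} \end{cases} \tag{3}$$
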

We again distinguish between two versions of the problem: 

Integral version (Int-Multi-Ssbo): $x_{j,k} \in \{0, 1\}$ for all $j$ and $k$. Here, $x_{j,k} = 1$ if the advertiser 
selects the $k$th slot for the $j$th keyword, and $x_{j,k} = 0$ otherwise. 



Fractional version (Frac-Multi-Ssbo): $0 \le x_{j,k} \le 1$ for all $j$ and $k$. Here, $x_{j,k}$ denotes the 
probability that the advertiser selects the $k$th slot for the $j$th keyword and $1 - \sum_{k=1}^{s} x_{j,k}$ is 
the probability with which the advertiser does not bid on the $j$th keyword at all. 

8 Throughout the paper, the notation $\mathrm{poly}(a)$ denotes a polynomial in $a$, i.e., $a^c$ for some positive constant $c$. 

9 For example, the stretch parameter $\kappa$ allows us to model situations such as when the real costs can be drawn 
from a probability distribution with a mean around $\frac{\kappa+1}{2} d_j$ with a negligible probability of occurring outside a range 
of $\pm\frac{\kappa-1}{2} d_j$ of the mean. Note that this is just an illustration. We do not assume any specific probability distribution 
for the variations of the real costs per click except that it varies within an interval of length $\kappa$. 

Note that the scenario model for multi-slot stochastic budget optimization is quite different in 
nature from the other multi-slot models such as the one discussed in [12] since, for example, one 
can go under or over the budget in one scenario to get a better overall expected payoff. 
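
The multi-slot payoff of Equation (3), together with constraint (2), can be sketched as follows. This is an illustrative sketch only: the function name `multi_slot_payoff`, the nested-list layout and the numbers are assumptions made for the example.

```python
def multi_slot_payoff(eps, clicks, cost, budget, x):
    """Expected payoff under Equation (3).

    clicks[i][j][k], cost[i][j][k]: a_{i,j,k} and c_{i,j,k} for scenario i,
    keyword j, slot k.  x[j][k]: selection variable of slot k of keyword j.
    """
    # Constraint (2): each keyword selects at most one slot (in total mass).
    for j, row in enumerate(x):
        if sum(row) > 1:
            raise ValueError(f"keyword {j} violates constraint (2)")
    total = 0.0
    for eps_i, a_i, c_i in zip(eps, clicks, cost):
        got = 0.0
        spend = 0.0
        for a_j, c_j, x_j in zip(a_i, c_i, x):
            for a, c, x_jk in zip(a_j, c_j, x_j):
                got += a * x_jk
                spend += a * c * x_jk
        # Scale the clicks by B / spend when the scenario exceeds the budget.
        scale = 1.0 if spend <= budget else budget / spend
        total += eps_i * got * scale
    return total

# Toy instance (made-up numbers): one keyword, two slots, two scenarios.
eps = [0.5, 0.5]
clicks = [[[8, 4]], [[2, 1]]]
cost = [[[1, 3]], [[1, 3]]]
budget = 6

print(multi_slot_payoff(eps, clicks, cost, budget, [[1, 0]]))  # slot 1 only
print(multi_slot_payoff(eps, clicks, cost, budget, [[0, 1]]))  # slot 2 only
```

Passing a fractional row such as `[[0.5, 0.5]]` evaluates a Frac-Multi-Ssbo solution with the same function, since only the domain of the selection variables changes between the two versions.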

2.4 Relevance and Significance of Scenario Models 

Scenario models are a popular way of modeling optimization problems involving uncertainties in 
parameters by creating a number of scenarios that depict the probability distribution of various 
possibilities and then provide a solution that optimizes the expectations of outcomes over these 
scenarios. The scenario model is important for at least two reasons as explained in [21], which we 
state below. Firstly, market analysts often think of uncertainty by explicitly creating a set of a few 
model scenarios, possibly attaching a weight to each scenario. Secondly, the scenario model gives us 
an important tool for understanding the fully general problem with arbitrary joint distributions. 
Allowing the full generality of an arbitrary joint distribution gives us significant modeling power, 
but poses challenges to the algorithm designer. Since a naive explicit representation of the joint 
distribution requires space exponential in the number of random variables, one often represents 
the distribution implicitly by a sampling oracle. A common technique, Sample Average Approximation, 
is to replace the true distribution by a uniform or non-uniform distribution over a set of 
samples drawn by some process from the sampling oracle, effectively reducing the problem to the 
scenario model. In addition to their usual applications in operations research (e.g., see [8]), this 
approach is getting more and more attention on Wall Street as financial portfolios are being created 
in this way (e.g., see [25]). For example, Cocco, Consiglio and Zenios in [7] developed a scenario- 
based optimization model for asset and liability management of participating insurance policies 
with minimum guarantees and Mausser and Rosen in [16] developed three scenario optimization 
models for portfolio credit risk. 
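
The reduction from a sampling oracle to the scenario model can be sketched in a few lines of Python. The oracle `toy_oracle` below is purely hypothetical (it stands in for whatever joint distribution over click vectors is available), and the helper name `sample_average_scenarios` is an assumption made for the example.

```python
import random
from collections import Counter

def toy_oracle(rng):
    """Hypothetical sampling oracle: returns one click vector for two keywords."""
    if rng.random() < 0.7:
        return (rng.choice([8, 10]), 1)   # regime favouring keyword 0
    return (1, rng.choice([6, 9]))        # regime favouring keyword 1

def sample_average_scenarios(oracle, num_samples, rng):
    """Replace the true distribution by the empirical one over drawn samples.

    Identical samples are merged, so each distinct click vector becomes one
    scenario whose probability is its empirical frequency.
    """
    counts = Counter(oracle(rng) for _ in range(num_samples))
    return [(count / num_samples, list(vec)) for vec, count in counts.items()]

rng = random.Random(0)
scenarios = sample_average_scenarios(toy_oracle, 1000, rng)
for prob, click_vector in sorted(scenarios, reverse=True):
    print(f"{prob:.3f}", click_vector)
```

The resulting list of (probability, click vector) pairs has exactly the shape of the scenario input described in Section 2.1, with the number of scenarios bounded by the number of samples drawn.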

In sponsored search, this is an appropriate model and embodies the "best response" strategy. 
There is a complex function that maps the state of the world and the users to the queries they 
pose and their actions such as whether they click on ads. The search engines give a limited amount 
of information to help advertisers 10 , and advertisers can learn various scenarios that determine 
their click vs cost behaviors to some extent by running experiments, analyzing their web traffic 
etc. However, sponsored search products only provide a limited bidding language to structure one's 
campaign 11 and hence, necessarily, most advertisers have to target different scenarios simultane- 
ously with each bidding choice. This is the stochastic budget optimization problem we study in 
this paper. One natural idea is for advertisers to recognize in real time the particular scenario one 
faces and then apply the best bidding for that scenario. However, this is difficult to do in practice 
because of limited and delayed information in the system, and it is also expensive to implement. 
Thus, stochastic budget optimization problems under the scenario model are very appropriate for 
sponsored search applications. 

We do acknowledge that other strategies besides the "best response" may be used by advertisers 
in practice 12 , and stochastic budget optimization algorithms proposed here are not currently used 
within the practical tools that are publicly available. Nevertheless, best response is a reasonable 

10 For example, see https://adwords.google.com/select/TrafficEstimatorSandbox 

11 See, for example, http://algo.research.googlepages.com/ec09-partl.pdf 

12 By other strategies, we mean strategies in which the advertiser does not fix the strategies of other advertisers. 
strategy (even recommended by some search engines), and indeed many anecdotal conversations 
with advertisers and sponsored search optimizers have clearly indicated to us that they would like 
to bid to balance across a myriad of scenarios. Our algorithms in this paper (even the dynamic 
programming based ones) can be easily implemented in current systems. 

2.5 Notational Remarks 

As the reader may have already observed, precise definitions of the various models involve a lot 
of variables and subscripts. To make the exposition clearer, we will therefore adopt the following 
conventions: 

• For variables involving keywords, scenarios and (for the multi-slot model) slots, we will use 
subscripts $i$, $j$ and $k$ (and their obvious variations such as $i_1$, $i'$, etc.) for scenarios, keywords 
and slots, respectively. 

• Variables such as $m$, $n$, $\mathcal{K}_j$, $d_j$, $B$, $\varepsilon_i$, $a_{i,j}$, $a_{i,j,k}$, $c_{i,j}$, $c_{i,j,k}$, $x_j$, $x_{j,k}$, payoff, payoff$_i$, $\kappa$ and $s$, 
when used in the context of the stochastic budget optimization models, will be used for 
their intended meanings as described in Sections 2.1-2.3. 

• Note that: 

- $m$, $n$, $d_j$, $B$, $a_{i,j}$, $a_{i,j,k}$, $c_{i,j}$, $c_{i,j,k}$ and $s$ are positive integers; 

- $0 < \varepsilon_i \le 1$ and $\sum_{i=1}^{m} \varepsilon_i = 1$; 

- $1 \le \kappa = O(\mathrm{poly}(\log(m + n)))$ is an integer. We refer to this in the sequel by the phrase 
"$\kappa$ is a small integer". 

• The size of an input instance of our Sbo problems, which we will denote by size-of-input and 
which is crucial in differentiating polynomial-time algorithms from pseudo-polynomial-time 
algorithms, is as follows: 

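- For Int-Ssbo and Frac-Ssbo: 

$$\text{size-of-input} = \mathrm{poly}\left( m + n + \max_{\substack{1 \le i \le m \\ 1 \le j \le n}} \log_2 a_{i,j} + \max_{\substack{1 \le i \le m \\ 1 \le j \le n}} \log_2 c_{i,j} + \max_{1 \le i \le m} \frac{1}{\varepsilon_i} \right)$$

- For Int-Multi-Ssbo and Frac-Multi-Ssbo: 

$$\text{size-of-input} = \mathrm{poly}\left( s + m + n + \max_{\substack{1 \le i \le m \\ 1 \le j \le n \\ 1 \le k \le s}} \log_2 a_{i,j,k} + \max_{\substack{1 \le i \le m \\ 1 \le j \le n \\ 1 \le k \le s}} \log_2 c_{i,j,k} + \max_{1 \le i \le m} \frac{1}{\varepsilon_i} \right)$$
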

On rare occasions, if we need to reuse the above-mentioned indices or variables and thus deviate 
from these conventions, the accompanying text will make the deviation clear. 



3 Summary of Results and Proof Techniques 

[21] left the computational complexity issues of the scenario model as the main open problem after 
showing that both the integral and fractional versions of this problem, even for single slot case, 
are NP-hard and noting that no non-trivial approximability results are known. While prior results 
for (S)BO problems exploit insights from the Knapsack problem to associate some potential payoff 
with each keyword, a central difficulty encountered in directly applying those techniques for our 
models is that payoff from a keyword can be very different from one scenario to another. 

3.1 Summary of Results 

We provide a slightly coarse summary of the results obtained in this paper; precise bounds are 
available in the corresponding technical section that proves the result. 

Main Results 

(R1) (Approximation algorithms): We provide algorithms that run in near-linear time and 
achieve the following approximation ratios 13 : 

• $\min\{O(m), O(\kappa \log d_n)\}$-approximation for both Int-Ssbo and Frac-Ssbo, and 

• $\min\{O(m), O(s\,\kappa \log \Delta \log^2(m + n))\}$-approximation for Int-Multi-Ssbo and Frac-Multi-Ssbo, 
where $\Delta = \max_{i,j,k} c_{i,j,k}$. 

(R2) (Approximation hardness for the single slot case) We show that, unless ZPP = NP, 
there exist instances of Int-Ssbo and Frac-Ssbo, with $n$ keywords and $m = n$ scenarios 
each with equal probability, such that any polynomial-time algorithm for solving these 
problems must have an approximation ratio of any one of the following (for any constant 
$0 < \varepsilon < 1$): 

• $\Omega(\kappa \log^{1-\varepsilon} d_n)$. 

This almost matches the upper bounds in (R1). Thus, we cannot in general improve the 
approximation bound in (R1). 

(R3) (Approximation hardness for the multi-slot case) Since Ssbo problems are special cases 
of Multi-Ssbo problems for $s = 1$, the approximation hardness bounds for Ssbo can be extended 
to Multi-Ssbo in an obvious manner. However, because the selection 
variables of various slots of the same keyword are dependent on each other via constraints 
such as Equation (2), our lower bound proofs translate to corresponding lower bounds of the 
form $\Omega(m^{1-\varepsilon})$, $\Omega(n^{1-\varepsilon})$, or $\Omega(\log \kappa \cdot \log^{1-\varepsilon} d_n)$ for a Multi-Ssbo instance with $n$ keywords, 
$m = n$ scenarios and $s$ slots. Thus, unfortunately, the lower bounds are independent 
of $s$, though one would expect the computational complexity of the problem to depend on $s$, 
say when $s$ is large compared to $n$. 

Using a different amplification of NP-hard problems as suggested by Raz's parallel repetition 
theorem [11, 18, 22], we can show that there exist instances of Int-Multi-Ssbo and Frac-Multi-Ssbo 
with $n$ keywords and $s > 1$ slots such that any polynomial-time algorithm 
for solving these problems must have an approximation ratio of $2^{\log^{1-\varepsilon}(ns)}$ for any constant 
$0 < \varepsilon < 1$, provided $\mathrm{NP} \not\subseteq \mathrm{DTIME}(n^{\mathrm{poly}(\log n)})$ 14 . As an example, $2^{\log^{1-\varepsilon}(ns)}$ dominates $n^{1-\varepsilon}$ if 
$s = \Omega(n^{\log n})$. 

13 The reader is reminded that $\kappa = O(\mathrm{poly}(\log(m + n)))$. 

14 More detailed discussions on the generalization of the lower bound for the single-slot case to the multi-slot case 
and comparison of the two lower bounds is available in Section 6.2. 

We also show that Int-Multi-Ssbo is MAX-SNP-hard for $s = 2$ even when $\kappa = 1$ and $c_{i,j,k} = 1$ 
for all $i$, $j$ and $k$. 

Other Results 

In addition to the main results, we also prove a number of other results dealing with variations and 
special cases of our problems. 

Fixed parameter tractability issues: For certain parameter ranges of practical interest we 
show that these optimization problems can be solved efficiently. If $m$ or $ns$ is fixed, Frac-Multi-Ssbo 
has a polynomial time solution with an absolute error of $\delta$ for any fixed $\delta > 0$. If 
additionally bids are polynomial in size, Int-Multi-Ssbo also has a polynomial time solution 
with an absolute error of $\delta$ for any fixed $\delta > 0$. 

Limitations of semi-definite programming based approaches: The lower bounds in (R2) 
have $\varepsilon < 1$ and thus leave a "very small" gap between these lower bounds and the upper bounds 
described in (R1). It is natural to ask if the gap could be eliminated; for example, can we 
design an approximation algorithm for the special case of $\kappa = 1$ whose approximation ratio 

log log ™d n ) ^ Although we are unable to provide a concrete proof that 
such a polynomial time approximation algorithm does not exist, we nonetheless observe that 
the natural semidefinite programming relaxation will not work since it has a large integrality 

s a P° f f = (io|fer)- 

Dual of Ssbo problems: Finally, in some cases, the dual of the stochastic budget optimization 
problem may be of interest, where we are given a target expected number of clicks and the 
goal is to minimize the expected budget spent while reaching the target. We present some 
exact and approximate results for this dual version of the problem. 

3.2 Brief Overview of Proof Techniques 

In general, budget optimization problems are akin to knapsack problems 15 . But the stochastic 
budget optimization problems studied in this paper are different because their budgets are "soft" , 
i.e., they can be exceeded, if under a suitable scaling they meet the budget constraint, and this 
improves the objective function. The stochastic budget optimization problems can be more in- 
sightfully thought of as special bipartite quadratic programs (these with ±1 variables correspond 
to Grothendieck's inequality with a nice history, but we have 0/1 variables). Standard approaches 
to solving other special cases of quadratic programs, for example, using relaxations via semi-definite 
programming, do not provably work as we show. Instead, for upper bounds, we take alternative 
combinatorial approaches. For showing hardness results, we use intuitions from connections of our 
problems to these quadratic programs. For one proof, we show a reduction from the hard instances 
of the maximum independent set problem [15] on graphs to the bipartite 0/1 quadratic integer programming 
reformulations of Frac-Ssbo and Int-Ssbo. For the hardness proofs for the multi-slot case, 
we have to start from an inapproximability result for a certain type of multi-prover system to obtain 
the best hardness results, again crucially using the connection to these quadratic programs. While 
anecdotally one may indeed believe these problems to be computationally hard, our results show 
that this is not true for many ranges of parameters of interest, but do identify the parameter settings 
that make them computationally hard. Taken together, our results are the first known non-trivial 

15 See for example, http://algo.research.googlepages.com/ec09_pub.pdf 
complexity results for stochastic budget optimization problems under the scenario model beyond 
NP-hardness. 



4 Sbo Problems and Bipartite Quadratic Integer Programs 

In this section we show how to reformulate various Sbo problems as bipartite quadratic integer 
programs (QIP). These reformulations are heavily used in later proofs in the paper. A bipartite 
quadratic program is a quadratic program in which there is a bipartition of variables such that 
every term involves at most one variable from each partition. A well-known example of such a 
(strict) quadratic program on variables taking ±1 values is the so-called Grothendieck's inequality [1]. 
However, as we will show later, our quadratic program differs significantly in nature from this 
inequality. 

4.1 Ssbo and QIP 



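(* Quadratic program (Q1) *) 
(* $w_{i,j} = y_{i,j}\, c_{i,j}$ for all $i$ and $j$ *) 

$$\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{m} \sum_{j=1}^{n} \alpha_i\, y_{i,j}\, x_j \\
\text{subject to} \quad & \forall\, 1 \le i \le m:\ \alpha_i \Big( \sum_{j=1}^{n} w_{i,j}\, x_j \Big) \le B_i \\
& \forall\, 1 \le i \le m:\ 0 \le \alpha_i \le 1 \\
& \forall\, 1 \le j \le n:\ 0 \le x_j \le 1
\end{aligned}$$

(* Quadratic program (Q2) *) 
(* $w_{i,j,k} = y_{i,j,k}\, c_{i,j,k}$ for all $i$, $j$ and $k$ *) 

$$\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{m} \alpha_i \Big( \sum_{j=1}^{n} \sum_{k=1}^{s} y_{i,j,k}\, x_{j,k} \Big) \\
\text{subject to} \quad & \forall\, 1 \le i \le m:\ \alpha_i \Big( \sum_{j=1}^{n} \sum_{k=1}^{s} w_{i,j,k}\, x_{j,k} \Big) \le B_i \\
& \forall\, 1 \le j \le n:\ \sum_{k=1}^{s} x_{j,k} \le 1 \\
& \forall\, 1 \le i \le m:\ 0 \le \alpha_i \le 1 \\
& \forall\, 1 \le j \le n\ \forall\, 1 \le k \le s:\ 0 \le x_{j,k} \le 1
\end{aligned}$$

Figure 1: Quadratic Integer Programs for Sbo Problems. $Y$ is a matrix with non-negative entries 
($y_{i,j}$ for (Q1) and $y_{i,j,k}$ for (Q2)) and $B_1, B_2, \ldots, B_m$ are positive real numbers. 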

We show how to reformulate Ssbo as a bipartite quadratic integer program. Consider the 
quadratic program (Q1) in Fig. 1. By the "integral version" of (Q1) we refer to replacing the constraint 
$0 \le x_j \le 1$ by $x_j \in \{0, 1\}$. 

Proposition 1. The quadratic program (Q1) and its integral version are equivalent to Frac-Ssbo and 
Int-Ssbo, respectively. 

Proof. Consider an instance of Ssbo. Let $y_{i,j} = \varepsilon_i a_{i,j}$, $w_{i,j} = c_{i,j}\, y_{i,j}$ and $B_i = \varepsilon_i B$. Intuitively, 
the $B_i$'s correspond to budgets for the $i$th scenario scaled by the probability of the $i$th 
scenario, and the $y_{i,j}$'s are the $a_{i,j}$'s scaled by the probability of the $i$th scenario. Then, the 
inequality $\sum_{j=1}^{n} a_{i,j} c_{i,j} x_j \le B$ becomes $\sum_{j=1}^{n} y_{i,j} c_{i,j} x_j \le B_i \equiv \sum_{j=1}^{n} w_{i,j} x_j \le B_i$ and the fraction 
$\frac{B}{\sum_{j=1}^{n} a_{i,j} c_{i,j} x_j}$ becomes $\frac{B_i}{\sum_{j=1}^{n} w_{i,j} x_j}$. Conversely, given an instance of (Q1), let $B = \sum_{i=1}^{m} B_i$, 
$\varepsilon_i = \frac{B_i}{B}$ and $a_{i,j} = \frac{y_{i,j}}{\varepsilon_i}$. Thus, $\varepsilon_i a_{i,j} = y_{i,j}$, the inequality $\sum_{j=1}^{n} w_{i,j} x_j \le B_i \equiv \sum_{j=1}^{n} y_{i,j} c_{i,j} x_j \le B_i$ 
is the same as $\sum_{j=1}^{n} a_{i,j} c_{i,j} x_j \le B$ and the fraction $\frac{B_i}{\sum_{j=1}^{n} w_{i,j} x_j}$ is the same as $\frac{B}{\sum_{j=1}^{n} a_{i,j} c_{i,j} x_j}$. 

Thus, in the sequel, we assume such a correspondence. 
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Now, consider a solution vector x — (x±, X2, ■ ■ ■ , x n ) and a = (a±, «2, . . . , ot m ) for (Ql). Then 
x also defines a solution vector for Ssbo. We must verify that this is indeed a valid solution vector 
with a correct expected payoff. Let Qi = Yl^=i w i,j x j- ^ a %Qi < then on = 1 since otherwise 
the solution for (Ql) can be further improved, and then E[payofLJ = Y0j=i Vi,j x jy which is correct. 

Otherwise oiiQi = Bi and then E [payoff J = -— j — YTi=\Vi i x j = a % Yll=i Hi j x j> which is also 

BijOLi J 1 

correct. This shows that for every instance of (Ql) there is a corresponding instance of Ssbo with 
the same expected payoff. 

Now, consider a solution vector x for Ssbo. Then, if Qi > Bi then oii = Bi/Qi otherwise 
a.i = 1. It is easy to see in the same manner that this provides a valid solution of (Ql) with the 
same objective value. □ 
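The second half of the proof can be read as a small procedure: for a fixed selection vector $x$, choose each $\alpha_i$ as $\min\{1, B_i/Q_i\}$ and evaluate the objective of (Q1). The following is a minimal sketch of that evaluation; the function name and the list-of-lists representation of $Y$ and $W$ are illustrative assumptions, not part of the paper.

```python
from typing import List, Sequence, Tuple


def optimal_alpha_and_payoff(
    y: Sequence[Sequence[float]],
    w: Sequence[Sequence[float]],
    budgets: Sequence[float],
    x: Sequence[float],
) -> Tuple[List[float], float]:
    """For a fixed x, set alpha_i = min(1, B_i / Q_i) and return (alpha, objective of (Q1)).

    y[i][j] and w[i][j] are the entries y_{i,j} and w_{i,j}; budgets[i] is B_i.
    (Hypothetical helper written for illustration only.)
    """
    alphas: List[float] = []
    objective = 0.0
    for y_row, w_row, b_i in zip(y, w, budgets):
        # Q_i = sum_j w_{i,j} x_j is the unscaled spend in scenario i.
        q_i = sum(w_ij * x_j for w_ij, x_j in zip(w_row, x))
        # Scale down only when the spend exceeds the scenario budget.
        alpha_i = 1.0 if q_i <= b_i else b_i / q_i
        alphas.append(alpha_i)
        objective += alpha_i * sum(y_ij * x_j for y_ij, x_j in zip(y_row, x))
    return alphas, objective
```

For example, with two scenarios, two keywords and $c_{i,j} = 2$ throughout, calling the function on `y = [[1, 2], [3, 1]]`, `w = [[2, 4], [6, 2]]`, `budgets = [3, 10]` and `x = [1, 1]` scales only the first scenario.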



Relationship to the Standard Knapsack Problems

If $m = \kappa = 1$ and $\alpha_1$ is set to a fixed constant, then (Q1) reduces to a special linear program which is equivalent to the so-called (fractional) knapsack problem, which is well-studied in the literature. Extending this analogy, by the phrase "the standard fractional knapsack problem corresponding to the $i$th row and $p$th through $q$th column of $Y$", we will mean the linear program shown in Fig. 2 (it is easy to see that there is an optimal solution of this linear program in which $\alpha_i = \min\left\{ 1, \frac{B_i}{\sum_{j=p}^{q} c_{i,j}\, y_{i,j}} \right\}$).

maximize $\alpha_i \sum_{j=p}^{q} y_{i,j}\, x_j$
subject to
  $\alpha_i \left( \sum_{j=p}^{q} c_{i,j}\, y_{i,j}\, x_j \right) \le B_i$
  $\forall\, p \le j \le q$: $0 \le x_j \le 1$

Figure 2: LP for $i$th row and $p$th through $q$th column of $Y$.

Since $d_p \le d_{p+1} \le \cdots \le d_q$, the following well-known fact follows.

Fact 1. [13] An optimal solution to the linear program in Fig. 2 ("optimal payoff for the $i$th row and $p$th through $q$th column of $Y$") is a "prefix solution", i.e., there is an index $j'$ such that $x_j = 1$ for $j < j'$, $0 \le x_{j'} \le 1$ and $x_j = 0$ for $j > j'$.
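A prefix solution of the kind described in Fact 1 can be produced by the usual greedy rule for fractional knapsack. The sketch below fixes $\alpha_i = 1$ and assumes the columns are already sorted by non-decreasing, strictly positive cost; the function name and argument layout are hypothetical and only meant to illustrate the shape of the solution.

```python
from typing import List, Sequence


def prefix_solution(
    y_row: Sequence[float], costs: Sequence[float], budget: float
) -> List[float]:
    """Greedy for: maximize sum_j y_j x_j  s.t.  sum_j costs_j * y_j * x_j <= budget, 0 <= x_j <= 1.

    Assumes costs are positive and sorted in non-decreasing order, so that
    filling keywords from left to right is the standard fractional-knapsack greedy.
    """
    x = [0.0] * len(y_row)
    remaining = budget
    for j, (y_j, c_j) in enumerate(zip(y_row, costs)):
        weight = c_j * y_j  # budget consumed if keyword j is fully selected
        if weight <= remaining:
            x[j] = 1.0
            remaining -= weight
        else:
            # First keyword that does not fit: take the fraction that fits, then stop.
            x[j] = remaining / weight
            break
    return x
```

With `y_row = [2, 1, 4]`, `costs = [1, 2, 3]` and `budget = 5`, the first two keywords are taken fully and the third fractionally, which is exactly the prefix shape of Fact 1.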

4.2 Multi-Ssbo and QIP 

The quadratic programming reformulation of Multi-Ssbo can also be obtained in a similar manner 
and is shown as (Q2) in Fig. 1. 

5 Poly-logarithmic Approximations for Ssbo and Multi-Ssbo (main result (R1))

Theorem 1 (Near-linear time approximation). There is a

(i) $\min\{O(m),\, O(\kappa \log d_n)\}$-approximation for both Int-Ssbo and Frac-Ssbo;

(ii) $\min\{O(m),\, O(s \kappa \log \Delta)\}$-approximation for Frac-Multi-Ssbo; and

(iii) $\min\{O(m),\, O(s \kappa \log \Delta \log^2(m + n))\}$-approximation for Int-Multi-Ssbo,

where, for (ii) and (iii), $\Delta = \max_{j,k} d_{j,k}$. All these algorithms can be implemented in linear or near-linear time using standard data structures and algorithmic techniques.

1. Partition the keywords into maximal groups such that if a group $G$ contains the $p$th through $q$th keywords then $d_q/d_p \le 2$ and $d_{q+1}/d_p > 2$.
   Let $\mathcal{G}$ be the set of such groups.

2. For each group $G \in \mathcal{G}$ consisting of keywords, say $\mathcal{K}_p, \mathcal{K}_{p+1}, \ldots, \mathcal{K}_q$, do
      Set $x_j = 1$ for every $p \le j \le q$;
      let $\mathbb{E}[\text{payoff}']$ be the payoff of this solution.

3. Output the best of the solutions obtained in Step 2.

Figure 3: Algorithm for the case of $\kappa = 1$.
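The three steps of Fig. 3 can be sketched in a few lines. The code below is an illustrative reading of the figure, not the authors' implementation: it assumes the costs `d` are positive and sorted in non-decreasing order, uses 0-based indices, evaluates each group with $\alpha_i = \min\{1, B_i/\text{spend}_i\}$ as in (Q1), and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def group_keywords(d: Sequence[float]) -> List[Tuple[int, int]]:
    """Step 1: split indices 0..n-1 into maximal runs (p, q) with d[q] / d[p] <= 2."""
    groups: List[Tuple[int, int]] = []
    start = 0
    for j in range(1, len(d) + 1):
        # Close the current group at the end of the list or when the ratio exceeds 2.
        if j == len(d) or d[j] > 2 * d[start]:
            groups.append((start, j - 1))
            start = j
    return groups


def expected_payoff(
    y: Sequence[Sequence[float]],
    d: Sequence[float],
    budgets: Sequence[float],
    group: Tuple[int, int],
) -> float:
    """Step 2: payoff of selecting every keyword in the group (x_j = 1 for p <= j <= q)."""
    p, q = group
    total = 0.0
    for y_row, b_i in zip(y, budgets):
        clicks = sum(y_row[p:q + 1])
        spend = sum(d[j] * y_row[j] for j in range(p, q + 1))
        alpha = 1.0 if spend <= b_i else b_i / spend
        total += alpha * clicks
    return total


def uniform_ssbo(
    y: Sequence[Sequence[float]], d: Sequence[float], budgets: Sequence[float]
) -> Tuple[Tuple[int, int], float]:
    """Step 3: return the best group and its payoff (requires at least one keyword)."""
    best = max(group_keywords(d), key=lambda g: expected_payoff(y, d, budgets, g))
    return best, expected_payoff(y, d, budgets, best)
```

On `d = [1, 2, 3, 8]` the grouping step yields three groups, and with `y = [[1, 1, 1, 1], [2, 0, 0, 1]]` and `budgets = [3, 4]` the first group wins.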



In the rest of this section, we prove the above theorem. As a first attempt, one might be tempted to use recent techniques in designing efficient algorithms for multiple-knapsack problems [6, 17] for our problem; however, it is not difficult to design examples where such approaches fail badly since our budget constraints are "soft" (they can be exceeded if scaling them gives better payoff) and our probabilities are "arbitrary". As a second attempt, one might take our quadratic programming reformulation as discussed in Section 4 and apply a semidefinite-programming based rounding approach such as in [14]. However, it can be shown that the integrality gap of such a reformulation is very large. The failure of these natural approaches shows the difficulty of the problems. Thus, we are led to explore other combinatorial approaches to provide the desired approximation.



5.1 O(m)-approximation for Int-Ssbo and Frac-Ssbo 

To get an $O(m)$-approximation we can do the following. For each $i$ we solve the standard (integer or fractional) knapsack problem for the $i$th row of $Y$; let $\rho_i$ be the value of an optimal solution. Then, take the best of these solutions, say of value $\rho = \max_{1 \le i \le m} \{\rho_i\}$. Each fractional knapsack problem can be solved exactly in $O(n \log n)$ time [13] and an $O(n \log n)$ time greedy 2-approximation algorithm for the integer knapsack problem is also well known [19].

We now note that $\mathbb{E}[\text{payoff}] \le \sum_{i=1}^{m} \rho_i$. Indeed, consider an optimal solution of Ssbo. If $\alpha_i = 1$, then by definition of $\rho_i$ we have $\mathbb{E}[\text{payoff}_i] \le \rho_i$. If $\alpha_i < 1$, then we set $\alpha_i = 1$ and set a new value of $x_j$ as $x'_j = \alpha_i x_j$. This does not change $\mathbb{E}[\text{payoff}_i]$ and now we again have $\mathbb{E}[\text{payoff}_i] \le \rho_i$. Thus, we have $\rho \ge \mathbb{E}[\text{payoff}]/m$.

If $\rho = \rho_i$ for some $i$, then the solution of the knapsack problem of value $\rho$ can be extended to a solution of Ssbo by setting $\alpha_{i'} = 0$ for $i' \ne i$.



5.2 $O(\kappa \log d_n)$-approximation for Int-Ssbo and Frac-Ssbo
Case of $\kappa = 1$: Uniform Cost Model

The algorithm is shown in Fig. 3. Consider a group $G \in \mathcal{G}$ consisting of the keywords $\mathcal{K}_p, \mathcal{K}_{p+1}, \ldots, \mathcal{K}_q$. By the "Ssbo problem on $G$" we mean the instance of the Ssbo problem in which our click input consists of the submatrix

$$Y_{p,q} = \begin{pmatrix} y_{1,p} & y_{1,p+1} & \cdots & y_{1,q} \\ \vdots & \vdots & & \vdots \\ y_{m,p} & y_{m,p+1} & \cdots & y_{m,q} \end{pmatrix}$$

of $Y$ containing all rows and the $p$th through $q$th columns, the costs-per-click $d_p, \ldots, d_q$, the budgets $B_1, \ldots, B_m$, and the selection variables $x_p, \ldots, x_q$. Let $\mathbb{E}[\text{payoff}_G]$ be the value of expected payoff of an optimal solution for this subproblem. Since $\max_{G \in \mathcal{G}} \mathbb{E}[\text{payoff}_G] \ge \frac{\mathbb{E}[\text{payoff}]}{|\mathcal{G}|}$ and $|\mathcal{G}| = O(\log d_n)$, the following lemma proves the desired approximation bound.

Lemma 2. $\mathbb{E}[\text{payoff}'] \ge \frac{\mathbb{E}[\text{payoff}_G]}{2}$.

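Proof. We only need to prove the lemma for the case when $\mathbb{E}[\text{payoff}_G]$ is the total expected payoff of an optimal solution of the Frac-Ssbo problem on $G$, since obviously the total expected payoff of an optimal solution of the Int-Ssbo problem on $G$ is no more than $\mathbb{E}[\text{payoff}_G]$. Let $D = \sum_{j=p}^{q} d_j\, y_{i,j}$ and $\beta = |G|$. By our choice of the group $G$,

$$d_p \sum_{j=p}^{q} y_{i,j} \;\le\; \sum_{j=p}^{q} d_j\, y_{i,j} \;\le\; 2\, d_p \sum_{j=p}^{q} y_{i,j}.$$

Using the quadratic programming formulation (Q1) and remembering that $c_{i,j} = d_j$ when $\kappa = 1$, the Frac-Ssbo instance on $G$ is equivalent to the following quadratic program (Q3):

(* Quadratic program (Q3) *)

maximize $\sum_{i=1}^{m} \alpha_i \left( \sum_{j=p}^{q} y_{i,j}\, x_j \right)$
subject to
  $\forall\, 1 \le i \le m$: $\alpha_i \left( \sum_{j=p}^{q} d_j\, y_{i,j}\, x_j \right) \le B_i$
  $\forall\, 1 \le i \le m$: $0 \le \alpha_i \le 1$
  $\forall\, p \le j \le q$: $0 \le x_j \le 1$

Fix any optimal solution for our Frac-Ssbo instance on $G$, i.e., fix an optimal solution vector $(\alpha^*_1, \alpha^*_2, \ldots, \alpha^*_m)$ and $(x^*_p, x^*_{p+1}, \ldots, x^*_q)$ of (Q3). Our solution sets $x_p = x_{p+1} = \cdots = x_q = 1$; thus $\alpha_i = \min\left\{1, \frac{B_i}{D}\right\}$ for every $i$ and $x_j \ge x^*_j$ for every $p \le j \le q$.

Case 1: $D \le B_i$. Then, $\alpha_i = 1 \ge \alpha^*_i$, $x_j = 1 \ge x^*_j$ for $p \le j \le q$, and thus

$$\alpha_i \left( \sum_{j=p}^{q} y_{i,j}\, x_j \right) \ge \alpha^*_i \left( \sum_{j=p}^{q} y_{i,j}\, x^*_j \right).$$

Case 2: $D > B_i$. Then, $\alpha_i = \frac{B_i}{D}$. Let $\alpha^+_i$ and $(x^+_p, x^+_{p+1}, \ldots, x^+_q)$ be the optimal prefix solution of the $i$th row and $p$th through $q$th column of $Y$. By Fact 1, there exists an index $1 \le \ell \le \beta$ such that $x^+_j = 1$ for $j < p + \ell - 1$, $0 \le x^+_{p+\ell-1} \le 1$ and $x^+_j = 0$ for $j > p + \ell - 1$, and

$$B_i = \alpha^+_i \sum_{j=p}^{p+\ell-1} d_j\, y_{i,j}\, x^+_j = \alpha^+_i \sum_{j=p}^{q} d_j\, y_{i,j}\, x^+_j \ge \alpha^*_i \sum_{j=p}^{q} d_j\, y_{i,j}\, x^*_j.$$

Now, it follows that

$$\alpha_i \sum_{j=p}^{q} y_{i,j}\, x_j = \frac{B_i}{D} \sum_{j=p}^{q} y_{i,j} \ge \frac{B_i}{2\, d_p} \ge \frac{1}{2}\, \alpha^*_i \sum_{j=p}^{q} y_{i,j}\, x^*_j,$$

where the first inequality uses $D \le 2\, d_p \sum_{j=p}^{q} y_{i,j}$ and the second uses $B_i \ge \alpha^*_i \sum_{j=p}^{q} d_j\, y_{i,j}\, x^*_j \ge d_p\, \alpha^*_i \sum_{j=p}^{q} y_{i,j}\, x^*_j$.

Thus, combining both cases, we have

$$\mathbb{E}[\text{payoff}'] = \sum_{i=1}^{m} \alpha_i \left( \sum_{j=p}^{q} y_{i,j}\, x_j \right) \ge \frac{1}{2} \times \sum_{i=1}^{m} \alpha^*_i \left( \sum_{j=p}^{q} y_{i,j}\, x^*_j \right) = \frac{\mathbb{E}[\text{payoff}_G]}{2}.$$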

□

Case of $\kappa > 1$: General Single-slot Model

Using our $\delta$-approximation algorithm for Uniform-Ssbo (for $\delta = O(\log d_n)$) as outlined in Fig. 3, we show how to use it as a subroutine to get a $\kappa\delta = O(\kappa \log d_n)$-approximation for Int-Ssbo (and, hence, also for Frac-Ssbo). The algorithm is shown in Fig. 4.

1. Replace (truncate) each $c_{i,j}$ by its new value $c'_{i,j} = d_j$.

2. Use the approximation algorithm in Fig. 3 with these new truncated values of the $c_{i,j}$'s.
   Let $x = (x_1, x_2, \ldots, x_n)$ and $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ be the solution vectors returned.

3. Output $x$ and $\alpha' = (\alpha'_1, \alpha'_2, \ldots, \alpha'_m) = \left( \frac{\alpha_1}{\kappa}, \frac{\alpha_2}{\kappa}, \ldots, \frac{\alpha_m}{\kappa} \right)$ as our solution.

Figure 4: $O(\kappa \log d_n)$-approximation algorithm for Int-Ssbo.
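Step 3 of Fig. 4 relies on the fact that dividing every $\alpha_i$ by $\kappa$ restores feasibility once the true costs $c_{i,j} \le \kappa d_j$ are put back. The sketch below illustrates only that rescaling and a feasibility check for (Q1); it assumes the vectors `x` and `alpha` were produced by some solver for the truncated instance, and the helper names are hypothetical.

```python
from typing import List, Sequence


def is_feasible(
    y: Sequence[Sequence[float]],
    c: Sequence[Sequence[float]],
    budgets: Sequence[float],
    x: Sequence[float],
    alpha: Sequence[float],
    tol: float = 1e-9,
) -> bool:
    """Check the budget constraints alpha_i * sum_j c_{i,j} y_{i,j} x_j <= B_i of (Q1)."""
    for y_row, c_row, b_i, a_i in zip(y, c, budgets, alpha):
        spend = a_i * sum(
            c_ij * y_ij * x_j for c_ij, y_ij, x_j in zip(c_row, y_row, x)
        )
        if spend > b_i + tol:
            return False
    return True


def rescale_alpha(alpha: Sequence[float], kappa: float) -> List[float]:
    """Step 3 of Fig. 4: alpha'_i = alpha_i / kappa."""
    return [a_i / kappa for a_i in alpha]
```

As a usage example, take one scenario with `y = [[1, 1]]`, truncated costs `[[1, 2]]`, true costs `[[2, 3]]` (so $\kappa = 2$ suffices) and budget `[3]`: the pair `x = [1, 1]`, `alpha = [1]` is feasible for the truncated costs but not for the true ones, while `rescale_alpha(alpha, 2)` is feasible for the true costs.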
We use the following notations:

• $x^* = (x^*_1, x^*_2, \ldots, x^*_n)$ and $\alpha^* = (\alpha^*_1, \alpha^*_2, \ldots, \alpha^*_m)$ are the solution vectors for an optimal solution of our (original) instance of Ssbo, and $\mathbb{E}[\text{payoff}^*] = \sum_{i=1}^{m} \alpha^*_i \left( \sum_{j=1}^{n} y_{i,j}\, x^*_j \right)$ is the total expected payoff of this optimal solution.

• $x^+ = (x^+_1, x^+_2, \ldots, x^+_n)$ and $\alpha^+ = (\alpha^+_1, \alpha^+_2, \ldots, \alpha^+_m)$ are the solution vectors for an optimal solution of the truncated instance of Ssbo, and $\mathbb{E}[\text{payoff}^+] = \sum_{i=1}^{m} \alpha^+_i \left( \sum_{j=1}^{n} y_{i,j}\, x^+_j \right)$ is the total expected payoff of this optimal solution.

• $\mathbb{E}[\text{payoff}] = \sum_{i=1}^{m} \alpha'_i \left( \sum_{j=1}^{n} y_{i,j}\, x_j \right)$ is the total expected payoff of the solution obtained by using the algorithm in Fig. 4.

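Proposition 2. The following statements are true:

(a) $x$ and $\alpha'$ correspond to a valid solution of the Ssbo instance.

(b) $\mathbb{E}[\text{payoff}^+] \ge \mathbb{E}[\text{payoff}^*]$.

(c) $\mathbb{E}[\text{payoff}] \ge \frac{\mathbb{E}[\text{payoff}^+]}{\kappa\delta}$.

Thus the algorithm in Fig. 4 is an $O(\kappa \log d_n)$-approximation.

Proof.

(a) $\alpha'_i\, c_{i,j} = \frac{\alpha_i}{\kappa}\, c_{i,j} \le \frac{\alpha_i}{\kappa}\, \kappa\, d_j = \alpha_i\, d_j$; thus $\alpha_i \left( \sum_{j=1}^{n} y_{i,j}\, d_j\, x_j \right) \le B_i$ implies $\alpha'_i \left( \sum_{j=1}^{n} y_{i,j}\, c_{i,j}\, x_j \right) \le B_i$.

(b) The solution vectors $x^*$ and $\alpha^*$ for an optimal solution of the Ssbo instance are also valid (not necessarily optimal) solution vectors for the truncated instance of Ssbo since $d_j \le c_{i,j}$.

(c) This follows since $\alpha'_i = \frac{\alpha_i}{\kappa}$. □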
5.3 Approximation Bounds for Frac-Multi-Ssbo and Int-Multi-Ssbo

To get an $O(m)$-approximation we follow the same approach as in Section 5.1. For each $i$ we solve the restriction of the Multi-Ssbo problem on the $i$th scenario, i.e., the quadratic program (Q4) as shown in Fig. 5, and then take the best of these solutions. It is easy to see that an optimal solution of (Q4) satisfies $\alpha_i = \min\left\{ 1, \frac{B_i}{\sum_{j=1}^{n} \sum_{k=1}^{s} w_{i,j,k}\, x_{j,k}} \right\}$. For any fixed value of $\alpha_i$, (Q4) is known in the literature as the multiple-choice Knapsack problem with $sn$ objects divided into $n$ classes and a knapsack capacity of $B_i/\alpha_i$; an $O(1)$-approximation algorithm for this problem that runs in $O(ns^2)$ time is known [19].

(* Quadratic program (Q4) *)

maximize $\alpha_i \left( \sum_{j=1}^{n} \sum_{k=1}^{s} y_{i,j,k}\, x_{j,k} \right)$
subject to
  $\alpha_i \left( \sum_{j=1}^{n} \sum_{k=1}^{s} w_{i,j,k}\, x_{j,k} \right) \le B_i$
  $\forall\, 1 \le j \le n$: $\sum_{k=1}^{s} x_{j,k} \le 1$
  $0 \le \alpha_i \le 1$
  $\forall\, 1 \le j \le n\ \forall\, 1 \le k \le s$: $0 \le x_{j,k} \le 1$

Figure 5: Multi-Ssbo restricted to the $i$th scenario.

We next show that algorithms for the single-slot case can be used for the multi-slot model with 
appropriate multiplicative factors in the approximation ratio. 

Lemma 3. There exists an $O(s \kappa \log \Delta)$-approximation (respectively, $O(s \log^2(m + n)\, \kappa \log \Delta)$-approximation) algorithm for Frac-Multi-Ssbo (respectively, Int-Multi-Ssbo).

Proof. We first prove our claim for Frac-Multi-Ssbo. Consider the quadratic program (Q2)' obtained from the quadratic program (Q2) for Frac-Multi-Ssbo by removing the constraints $\sum_{k=1}^{s} x_{j,k} \le 1$ for $1 \le j \le n$. If OPT and OPT' are the optimal values of the objective functions of (Q2) and (Q2)', respectively, then obviously OPT' $\ge$ OPT. A straightforward inspection shows that (Q2)' can be written down in the same form as (Q1) with $sn$ variables and $m$ constraints. Thus, using the already proven result of Theorem 1(i), we obtain a solution for (Q2)' whose objective value is $\Omega\!\left( \frac{\text{OPT}'}{\kappa \log(\max_{j,k} d_{j,k})} \right) = \Omega\!\left( \frac{\text{OPT}'}{\kappa \log \Delta} \right) \ge \Omega\!\left( \frac{\text{OPT}}{\kappa \log \Delta} \right)$. To convert this to a solution of Frac-Multi-Ssbo (i.e., to satisfy the constraints $\sum_{k=1}^{s} x_{j,k} \le 1$ for each $j$) we divide each $x_{j,k}$ by $\sum_{k=1}^{s} x_{j,k}$, which decreases the total payoff by no more than a factor of $s$.

The result for Int-Multi-Ssbo follows by translating the above worst-case approximation bound for Frac-Multi-Ssbo to a worst-case approximation of Int-Multi-Ssbo via the following lemma.

Lemma 4. (Approximating Int-Multi-Ssbo via Frac-Multi-Ssbo) Suppose that we have an $\eta$-approximation for Frac-Multi-Ssbo. Then, we also have an $O(\eta\gamma)$-approximation for Int-Multi-Ssbo where

$$\gamma = \begin{cases} \log m, & \text{if } s = 1 \\ \log^2(m + n), & \text{otherwise} \end{cases}$$

Proof. For a particular value of the vector $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_m)$, (Q2) reduces to a linear program on the variables $x = (x_1, x_2, \ldots, x_n)$. For ease of description, we consider the case of $s = 1$ first (i.e., the case of Frac-Ssbo). An inspection of (Q1) reveals that this linear program has exactly $n$ variables and $m$ inequalities, where the $i$th inequality $D_i$ (for $1 \le i \le m$) is of the form:

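$$D_i \;\stackrel{\mathrm{def}}{=}\; \alpha_i \left( \sum_{j=1}^{n} w_{i,j}\, x_j \right) \le B_i$$

Consider a solution $x^f = (x^f_1, x^f_2, \ldots, x^f_n)$ and $\alpha^f = (\alpha^f_1, \alpha^f_2, \ldots, \alpha^f_m)$ of Frac-Ssbo with $\mathcal{L} = \sum_{i=1}^{m} \sum_{j=1}^{n} \alpha^f_i\, y_{i,j}\, x^f_j$ as the value of its objective. We may assume that $\mathcal{L} > 100 \ln m$ since otherwise the approximation guarantee can be trivially achieved. We employ the following randomized rounding scheme to transform this solution to a solution of Int-Ssbo:

• For $i = 1, 2, \ldots, n$, we round $x^f_i$ randomly to 1 and 0 with probabilities $x^f_i$ and $1 - x^f_i$, respectively. Let $x_i \in \{0, 1\}$ be the resulting random variable.

• We return $x = (x_1, x_2, \ldots, x_n)$ and $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ as our solution where $\alpha_i = \frac{\alpha^f_i}{100 \ln m}$ for $1 \le i \le m$.

Let $\mathcal{L}' = \sum_{i=1}^{m} \sum_{j=1}^{n} \alpha_i\, y_{i,j}\, x_j$ be the new value of the objective and let $\mathcal{E}_i$ be the event that inequality $D_i$ holds for this randomized solution. By linearity of expectation, $\mathbb{E}[\mathcal{L}'] = \frac{\mathcal{L}}{100 \ln m}$. Consider the inequality $D_i$, and let $\alpha'_i = \frac{\alpha_i}{B_i}$. By linearity of expectation,

$$\mathbb{E}\left[ \sum_{j=1}^{n} \alpha'_i\, w_{i,j}\, x_j \right] = \frac{1}{100 \ln m} \cdot \frac{\alpha^f_i \sum_{j=1}^{n} w_{i,j}\, x^f_j}{B_i} \le \frac{1}{100 \ln m}.$$

Each random value $\alpha'_i\, w_{i,j}\, x_j$ can be thought of as an independent Poisson trial with a probability of success (i.e., a value of 1) of $\alpha'_i\, w_{i,j}\, x^f_j$. Thus, using the standard Chernoff bound [20, Exercise 4.1], we get:

$$\Pr[\mathcal{E}_i \text{ does not hold}] = \Pr\left[ \alpha_i \sum_{j=1}^{n} w_{i,j}\, x_j > B_i \right] = \Pr\left[ \sum_{j=1}^{n} \alpha'_i\, w_{i,j}\, x_j > 1 \right] < e^{-3 \ln m} = m^{-3}.$$

In a similar manner, one can show that $\Pr\left[ \mathcal{L}' < \frac{\mathcal{L}}{200 \ln m} \right] < \frac{1}{m}$. Thus, finally, using union bounds, we get

$$\Pr\left[ \mathcal{L}' \ge \frac{\mathcal{L}}{200 \ln m} \,\wedge\, \left( \wedge_{i=1}^{m}\, \mathcal{E}_i \text{ holds} \right) \right] \ge 1 - \Pr\left[ \mathcal{L}' < \frac{\mathcal{L}}{200 \ln m} \right] - \sum_{i=1}^{m} \Pr[\mathcal{E}_i \text{ does not hold}] \ge 1 - \frac{1}{m} - \frac{1}{m^2}.$$

Thus, we achieve the desired approximation bounds with $1 - o(1)$ probability.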

For the case of $s > 1$ (i.e., Frac-Multi-Ssbo), the same approach with some modifications works. In a nutshell, we have $n$ additional constraints $F_j$ (for $j = 1, 2, \ldots, n$) of the form $\sum_{k=1}^{s} x_{j,k} \le 1$. Thus, the total number of inequalities/equalities is $m + n$ and we need to do the analysis with "$\ln(n + m)$" replacing "$\ln m$". The only additional part that needs to be done is to show how to handle the $F_j$ constraints. Notice that the set of variables involved in $F_j$ is disjoint from the set of variables in any other $F_{j'}$ for $j' \ne j$. After rounding, we have $\sum_{k=1}^{s} x_{j,k} \le 100 \ln(m + n)$. We now select one of these variables $x_{j,1}$ to $x_{j,s}$, say $x_{j,t}$, such that $\sum_{i=1}^{m} \alpha_i\, x_{j,t}\, y_{i,j,t} = \max_{1 \le k \le s} \left\{ \sum_{i=1}^{m} \alpha_i\, x_{j,k}\, y_{i,j,k} \right\}$, set $x_{j,t} = 1$ and set $x_{j,k} = 0$ for $k \ne t$. After all these normalizations, we lose an additional factor of $100 \ln(m + n)$ and all constraints are satisfied. □
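For the single-slot case ($s = 1$), the rounding scheme in the proof of Lemma 4 amounts to one independent coin flip per keyword followed by a uniform rescaling of $\alpha$. The following sketch illustrates that scheme only; the function name is hypothetical, the random source is passed in explicitly so that runs are reproducible, and the code assumes $m \ge 2$ so that $\ln m > 0$. It does not verify the budget constraints, which the proof shows hold only with high probability.

```python
import math
import random
from typing import List, Sequence, Tuple


def round_fractional_solution(
    x_frac: Sequence[float], alpha_frac: Sequence[float], rng: random.Random
) -> Tuple[List[int], List[float]]:
    """Round x^f to {0, 1} and scale alpha^f by 1 / (100 ln m), as in the proof of Lemma 4.

    Assumes m = len(alpha_frac) >= 2 so that ln m is strictly positive.
    """
    m = len(alpha_frac)
    scale = 100.0 * math.log(m)
    # x_j becomes 1 with probability x^f_j and 0 otherwise, independently per keyword.
    x_int = [1 if rng.random() < x_f else 0 for x_f in x_frac]
    alpha = [a_f / scale for a_f in alpha_frac]
    return x_int, alpha
```

A call such as `round_fractional_solution([0.0, 1.0, 0.5], [1.0, 0.5], random.Random(0))` always rounds the first two keywords to 0 and 1 respectively, while the third depends on the random draw.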

Note that the claim in Lemma 4 is "pessimistic" in nature; indeed, as our claim in Theorem 1 
shows, for arbitrary parameter range both Int-Ssbo and Frac-Ssbo can be approximated to 
within the same ratio. □ 



6 Approximation-hardness Results for Ssbo and Multi-Ssbo (main result (R2))

6.1 Approximation-hardness Bounds for Ssbo 

Theorem 5 (Logarithmic inapproximability). There exist instances of Int-Ssbo and Frac-Ssbo, with $n$ keywords and $m = n$ scenarios each with equal probability, such that, unless ZPP = NP, any polynomial-time algorithm for solving these problems must have an approximation ratio of any one of the following:

• $\Omega(m^{1-\varepsilon})$ (and, thus, also $\Omega(n^{1-\varepsilon})$), or

• $\Omega(\kappa \log^{1-\varepsilon} d_n)$,

where $0 < \varepsilon < 1$ is any constant.

Proof. We construct instances of Ssbo with $n$ keywords and $m = n$ scenarios such that, for$^{16}$ any $\kappa$ and any values of $c_{i,j}$ in the range $[d_j, \kappa d_j)$, the claimed lower bound holds. We use the reformulation of Frac-Ssbo and Int-Ssbo as a bipartite quadratic program (Q1) as discussed in Section 4.

The standard maximum independent set (MIS) problem is defined as follows. We are given an undirected graph $G = (V, E)$. A subset of vertices $V' \subseteq V$ is called independent if for every two vertices $u, v \in V'$ we have $\{u, v\} \notin E$. The goal is to find an independent subset of vertices of maximum cardinality. It is known that MIS cannot be approximated to within a factor of $|V|^{1-\varepsilon}$ for any constant $0 < \varepsilon < 1$ unless ZPP = NP [15].

For notational simplicity, let $n = |V|$ and $a = n^{12}$. Set $m = n$. Select an arbitrary order $v_1, v_2, \ldots, v_n$ of the vertices in $V$. Intuitively, the $i$th column and the $(n + 1 - i)$th row of $Y$ correspond to the vertex $v_i$, and the entries of the matrix $Y$ are such that they are 0 above the reverse diagonal and encode the adjacency of vertices of $G$ on or below the reverse diagonal. Formally,

$$y_{i,j} = \begin{cases} 0, & \text{if } i + j < n + 1 \\ 1, & \text{if } i + j = n + 1 \\ 1, & \text{if } i + j > n + 1 \text{ and } \{v_{n-i+1}, v_j\} \in E \\ 0, & \text{if } i + j > n + 1 \text{ and } \{v_{n-i+1}, v_j\} \notin E \end{cases}$$

Fix $d_1, d_2, \ldots, d_n$ as $d_1 = 1$ and $d_i = a\, d_{i-1}$ for $1 < i \le n$. Thus, $\frac{c_{p,j}}{c_{q,i}} \ge \frac{d_j}{\kappa\, d_i} \ge n^6$ if $j > i$. Let $B_i = c_{i,n+1-i}$ for $1 \le i \le m = n$. Remembering that $w_{i,j} = c_{i,j}\, y_{i,j}$ for all $i$ and $j$, we have:

$$w_{i,j} = \begin{cases} 0, & \text{if } i + j < n + 1, \text{ or if } i + j > n + 1 \text{ and } \{v_{n-i+1}, v_j\} \notin E \\ c_{i,j}, & \text{if } i + j = n + 1, \text{ or if } i + j > n + 1 \text{ and } \{v_{n-i+1}, v_j\} \in E \end{cases}$$

Note that $n^{1-\varepsilon} = m^{1-\varepsilon} = O(\kappa \log^{1-\varepsilon} d_n)$ since $d_n = n^{12(n-1)}$ and $\kappa = \text{poly}(\log(m + n)) = \text{poly}(\log n)$. Let $\Delta_{\mathrm{ind}}$ and $\Delta_{Q1}$ be the maximum number of independent vertices in $G$ and an optimal value of the objective of the fractional or integral version of (Q1), respectively.
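To make the construction concrete, the sketch below builds $Y$, $W$ and the budgets from a small graph and evaluates a candidate solution of (Q1). It is an illustration under explicit assumptions: it takes the simplest admissible costs $c_{i,j} = d_j$, it uses 0/1 entries for $Y$ (1 on the reverse diagonal and for adjacent pairs below it, 0 elsewhere), it stores everything 1-indexed with index 0 unused, and all function names are hypothetical. Exact integer arithmetic is used because the costs grow as powers of $n^{12}$.

```python
from typing import Iterable, List, Set, Tuple


def build_instance(n: int, edges: Iterable[Tuple[int, int]]):
    """Build (y, w, budgets) for vertices 1..n, assuming c_{i,j} = d_j (illustrative choice)."""
    a = n ** 12
    d = [0] + [a ** (j - 1) for j in range(1, n + 1)]  # d_1 = 1, d_j = a * d_{j-1}
    adjacent = {frozenset(e) for e in edges}
    y = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            on_diagonal = i + j == n + 1
            below = i + j > n + 1 and frozenset((n - i + 1, j)) in adjacent
            y[i][j] = 1 if on_diagonal or below else 0
    w = [[d[j] * y[i][j] for j in range(n + 1)] for i in range(n + 1)]
    budgets = [0] + [d[n + 1 - i] for i in range(1, n + 1)]  # B_i = c_{i, n+1-i}
    return y, w, budgets


def solution_from_set(n: int, chosen: Set[int]) -> Tuple[List[int], List[int]]:
    """Set x_i = alpha_{n-i+1} = 1 exactly for the chosen vertices."""
    x = [0] * (n + 1)
    alpha = [0] * (n + 1)
    for i in chosen:
        x[i] = 1
        alpha[n - i + 1] = 1
    return x, alpha


def evaluate(y, w, budgets, x, alpha) -> Tuple[bool, int]:
    """Return (all budget constraints hold, objective value of (Q1))."""
    n = len(x) - 1
    rows = range(1, n + 1)
    feasible = all(
        alpha[i] * sum(w[i][j] * x[j] for j in rows) <= budgets[i] for i in rows
    )
    objective = sum(alpha[i] * y[i][j] * x[j] for i in rows for j in rows)
    return feasible, objective
```

On the path $v_1 - v_2 - v_3$, the independent set $\{v_1, v_3\}$ gives a feasible solution of value 2, while the non-independent set $\{v_1, v_2\}$ violates a budget constraint.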

Lemma 6. $\Delta_{Q1} \ge \Delta_{\mathrm{ind}}$.

Proof. Consider an optimal solution $V'$ of MIS on $G$ with $|V'| = \Delta_{\mathrm{ind}}$. We generate a solution of (Q1) by setting

$$x_i = \alpha_{n-i+1} = \begin{cases} 1, & \text{if } v_i \in V' \\ 0, & \text{otherwise} \end{cases}$$

Note that, since $V'$ is an independent set, if $i + j > n + 1$, $v_i \in V'$ and $\{v_i, v_j\} \in E$ then $v_j \notin V'$ and thus $x_i = \alpha_{n-i+1} = 1$ and $x_j = \alpha_{n-j+1} = 0$.

First, we show that this is indeed a valid solution of (Q1). For any $1 \le i \le n$, consider the constraint

$$\alpha_{n-i+1} \left( \sum_{j=1}^{n} w_{n-i+1,j}\, x_j \right) \le B_{n-i+1}.$$

If $\alpha_{n-i+1} = 0$, then the constraint is obviously satisfied since $B_{n-i+1} > 0$. Otherwise, $\alpha_{n-i+1} = x_i = 1$ and thus,

$$\alpha_{n-i+1} \left( \sum_{j=1}^{n} w_{n-i+1,j}\, x_j \right) = \sum_{j=1}^{n} w_{n-i+1,j}\, x_j = c_{n-i+1,i} + \sum_{\substack{j > i \\ \{v_i, v_j\} \in E}} w_{n-i+1,j}\, x_j = c_{n+1-i,i} = B_{n+1-i}.$$

Thus, all the constraints are satisfied. Finally, the value of the objective function is

$$\sum_{i=1}^{m} \sum_{j=1}^{n} \alpha_i\, x_j\, y_{i,j} = \sum_{i+j=n+1} \alpha_i\, x_j = \sum_{v_j \in V'} x_j = \Delta_{\mathrm{ind}}$$

and thus $\Delta_{Q1} \ge \Delta_{\mathrm{ind}}$. □

16 Remember that in Section 2.5 we fixed bounds on $\kappa$, namely, $\kappa = O(\text{poly}(\log(m + n)))$.
For the other direction, we first need a normalization lemma. 

Lemma 7 (Normalization lemma). Consider an optimal solution of (Q1) with an objective value of $\Delta_{Q1}$. Then, we can transform this solution to another solution of (Q1) of objective value $\Delta'_{Q1}$ such that:

(a) $x_i \in \{0, 1\}$ for each $i$;

(b) $\Delta'_{Q1} \ge \Delta_{Q1} - 1$; and

(c) if $\{v_i, v_j\} \in E$ then $x_i + x_j \le 1$.

Proof. Suppose that we are given an optimal solution of (Q1) with an objective value of $\Delta_{Q1}$. First, we note some properties of this solution.

Proposition 3. The following statements are true:

(i) for every $i$, $\alpha_{n-i+1}\, x_i \le 1$, and

(ii) for every $i$ and $j$, if $i + j > n + 1$ and $\{v_{n-i+1}, v_j\} \in E$ then $\alpha_i\, x_j \le n^{-6}$.

Proof. Consider the constraint $\alpha_{n-i+1} \left( \sum_{j=1}^{n} w_{n-i+1,j}\, x_j \right) \le B_{n-i+1} = c_{n-i+1,i}$. Since $w_{n-i+1,i} = c_{n-i+1,i}$, (i) follows.

(ii) is equivalent to the claim that $\alpha_{n-i+1}\, x_j \le n^{-6}$ if $j > i$ (for adjacent $v_i$ and $v_j$). Since $\frac{c_{p,j}}{c_{q,i}} \ge n^6$ if $j > i$ (for any $p$ and $q$), (ii) follows. □

Now we show how to "normalize" this solution such that each variable $x_i$ is 0 or 1, and the total objective value does not decrease too much. Let $\Gamma = \sum_{i+j \ne n+1} \alpha_i\, x_j\, y_{i,j}$. By Proposition 3(ii), $\Gamma \le n^2 \times n^{-6} = n^{-4}$. Thus, setting $\Phi = \sum_{i+j=n+1} \alpha_i\, x_j\, y_{i,j}$, it follows that $\Phi \le \Delta_{Q1} \le \Phi + n^{-4}$. Thus, subsequently we concentrate on the quantity $\Phi$.

If $\alpha_{n-i+1} = 0$ for some $i$, then we can set $x_i = 0$ without changing the value of $\Phi$. Let $I = \{\, n - i + 1 \mid \alpha_{n-i+1} > 0 \text{ and } x_i > 0 \,\}$. Consider the largest index $n - i + 1 \in I$. There are two cases to consider:

Case 1: $x_i \ge n^{-3}$. By Proposition 3(ii), $\alpha_{n-j+1} \le n^{-3}$ and $\alpha_{n-j+1}\, x_j \le \alpha_{n-j+1} \le n^{-3}$ for every $j > i$ such that $\{v_i, v_j\} \in E$.

We set $\alpha_{n-i+1} = x_i = 1$ and set $x_j = \alpha_{n-j+1} = 0$ for every $j > i$ such that $\{v_i, v_j\} \in E$. The change in $\Phi$ is at most $n \times n^{-3} = n^{-2}$.

Case 2: $x_i < n^{-3}$. We set $\alpha_{n-i+1} = x_i = 0$. The change in $\Phi$ is at most $n^{-3}$.

We now remove the index $n - i + 1$ from $I$ and continue with the next largest index. We continue until $I = \emptyset$. Since $|I| \le n$, the total change in $\Phi$ is at most $n^{-1} < 1 - n^{-4}$.

To complete the proof, we select vertices $v_j$ in the independent set if $x_j = 1$. □

To finish the proof of Theorem 5, we simply select those vertices $v_i$ for the independent set such that $x_i = 1$. We have now shown that $\Delta_{\mathrm{ind}} \le \Delta_{Q1} \le \Delta_{\mathrm{ind}} + 1$. Thus, since $\Delta_{\mathrm{ind}}$ and $\Delta_{Q1}$ are within a constant factor of each other and $\Delta_{\mathrm{ind}}$ cannot be approximated to within a factor of $n^{1-\varepsilon}$ for any constant $0 < \varepsilon < 1$, $\Delta_{Q1}$ cannot be approximated to within a factor of $\Omega(n^{1-\varepsilon})$, or $\Omega(m^{1-\varepsilon})$, or $\Omega(\kappa \log^{1-\varepsilon} d_n)$. □

6.2 Approximation Hardness Results for Multi-Ssbo 

A first natural approach for this would be to generalize the approximation hardness result for the single-slot case (Q1) in Theorem 5 to the multi-slot case (Q2). This can be trivially done by copying the construction of the single-slot case to one of the slots in the multi-slot case. However, after this, one can observe that:

the construction for the single-slot case cannot again be copied to another slot because of the constraints in Equation (2), which state that at most one selection variable in each slot can be set to 1.

Formally, the lower bound construction for (Q1) can be extended to (Q2) as follows:

• Identify $y_{i,j,1}$ of (Q2) with $y_{i,j}$ of (Q1) and set $y_{i,j,2} = y_{i,j,3} = \cdots = y_{i,j,s} = 0$ in (Q2).

• Identify $c_{i,j,1}$ of (Q2) with $c_{i,j}$ of (Q1) and set $c_{i,j,2} = c_{i,j,3} = \cdots = c_{i,j,s} = 0$ in (Q2).

• Identify $x_{j,1}$ of (Q2) with $x_j$ of (Q1).

This leads to the following approximation hardness result.

Corollary 8. There exist instances of Int-Multi-Ssbo and Frac-Multi-Ssbo, with $n$ keywords, $m = n$ scenarios each with equal probability and $s$ slots, such that, unless ZPP = NP, any polynomial-time algorithm for solving these problems must have an approximation ratio of $\Omega(n^{1-\varepsilon})$ or $\Omega(\kappa \log^{1-\varepsilon} \Delta)$, where $0 < \varepsilon < 1$ is any constant.

However, note that the lower bound in the above corollary does not involve s. This is 
unfortunate, since one would expect that the asymptotic complexity of Multi-Ssbo problems 
should depend on the number of slots s, especially when s is large. So, a natural question to ask is: 

can we obtain an approximation hardness result for Multi-Ssbo problems that shows dependencies on $s$?

Our next result in this section shows that one can prove a "super poly-logarithmic but sub-polynomial" bound$^{17}$ of $2^{\log^{1-\varepsilon}(ns)}$ that depends on $s$. The bound is not completely satisfactory because of its sub-exponential nature, but nonetheless provides evidence that the approximation hardness of Multi-Ssbo problems depends strongly on $s$. How do the two bounds $n^{1-\varepsilon}$ and $2^{\log^{1-\varepsilon}(ns)}$ compare? Note that:

• Since $2^{\log^{1-\varepsilon}(ns)} = o\left((ns)^{\delta}\right)$ for any constant $\delta > 0$, $n^{1-\varepsilon}$ dominates $2^{\log^{1-\varepsilon}(ns)}$ if $s$ is not too large compared to $n$, e.g., $s = O(n)$.

• On the other hand, $2^{\log^{1-\varepsilon}(ns)}$ dominates $n^{1-\varepsilon}$ if $s$ is large compared to $n$, e.g., $s = \Omega\left(n^{\log n}\right)$.

Thus, neither bound is subsumed by the other for all values of parameters.

Theorem 9 (Inapproximability of multi-slot case).

(a) (dependence on number of slots) There exist instances of Int-Multi-Ssbo and Frac-Multi-Ssbo with $n$ keywords, $s$ slots and $\kappa = 1$ such that, unless NP $\subseteq$ DTIME$\left(n^{\text{poly}(\log n)}\right)$, any polynomial-time algorithm for solving these problems must have an approximation ratio of $\Omega\left(2^{\log^{1-\varepsilon}(ns)}\right)$, where $0 < \varepsilon < 1$ is any constant.

(b) Int-Multi-Ssbo is MAX-SNP-hard for $s = 2$ even when $\kappa = 1$ and $c_{j,k} = 1$ for all $j$ and $k$.
6.2.1 Proof of Theorem 9 (a) 

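A schematic diagram of the entire reduction is shown below.

[Schematic: $L \in$ NP $\to$ (via [11]) Maxrep $\to$ Grouped Compatibility $\to$ Multi-Ssbo (Q2). If $I \in L$, the optimal values are $h$, $h$ and at least $h$, respectively. If $I \notin L$, the optimal values are at most $\frac{h}{2^{\log^{1-\delta}(ns)}}$, $\frac{h}{2^{\log^{1-\varepsilon}(ns)}}$ and $\frac{h}{2^{\log^{1-\varepsilon}(ns)}} + \frac{1}{ns} + \frac{1}{(ns)^4}$, respectively.]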

Our starting point is a one-round two-prover system on which the parallel repetition approach applies. It would be more convenient to describe the problem in a graph-theoretic form as the Maxrep problem [18]. We are given a bipartite graph $\mathcal{G} = (A, B, E)$ with $|A| = |B|$. Also given is a partition of $A$ into $n$ equal-size subsets $A_1, A_2, \ldots, A_n$, each with $s$ elements, and a partition of $B$ into $n$ equal-size subsets $B_1, B_2, \ldots, B_n$, each having $s$ elements (and, thus, $|A| = |B| = ns$). These partitions define a natural "bipartite super-graph" $\mathcal{H}$ in the following manner. $\mathcal{H}$ has a "super-vertex" for every $A_i$ (the left partition) and a "super-vertex" for every $B_j$ (the right partition). There exists a "super-edge" between the super-vertex $A_i$ and the super-vertex $B_j$ if and only if there exist $u \in A_i$ and $v \in B_j$ such that $\{u, v\}$ is an edge of $\mathcal{G}$. It is also given that $\mathcal{H}$ is a $d$-regular graph for some $d$. Thus, the number of super-edges $h$ is given by $h = dn$. A pair of nodes $u$ and $v$ "witnesses" a super-edge $\{A_i, B_j\}$ provided $u \in A_i$, $v \in B_j$ and the edge $\{u, v\}$ exists in $\mathcal{G}$. A set of nodes $S$ of $\mathcal{G}$ witnesses a super-edge if and only if there exists at least one pair of nodes in $S$ that witnesses the super-edge. In Maxrep, we are supposed to select a single vertex from each $A_i$ and a single vertex from each $B_j$. The goal of Maxrep is to maximize the size of its solution, namely the number of super-edges witnessed. The following result follows from [11] (see also [18] for a self-contained description).

17 $2^{\log^{1-\varepsilon} a}$ is $\omega(\text{poly}(\log a))$ and $o(a^{\delta})$ for any constant $\delta > 0$.

Theorem 10. [11, 18] Let $L \in$ NP and $0 < \delta < 1$ be any fixed constant. Then, there exists a reduction running in quasi-polynomial time, namely in time $n^{\text{poly}(\log n)}$, that given an instance $I$ of $L$ produces an instance $I'$ of Maxrep such that:

• if $I \in L$ then $I'$ has a solution of size $h$;

• if $I \notin L$ then $I'$ has every solution of size at most $\frac{h}{2^{\log^{1-\delta}(ns)}}$.

Thus, the above theorem provides a $2^{\log^{1-\delta}(ns)}$-inapproximability for Maxrep under the complexity-theoretic assumption of NP $\not\subseteq$ DTIME$\left(n^{\text{poly}(\log n)}\right)$.

Let $L$ be any language in NP. We first use the above theorem to translate an instance $I$ of $L$ to an instance $I'$ of Maxrep as described. We next translate this instance $I'$ to an instance of an intermediate problem, which we call the Grouped Compatibility (Group-Compatibility) problem, that we define next.

For notational convenience, we define two mappings $\sigma$ and $\pi$ such that $\sigma(u) = A_i$ and $\pi(v) = B_j$ if $u \in A_i$ and $v \in B_j$, respectively. The Group-Compatibility problem is derived from Maxrep as follows. Given an instance of Maxrep as described above, we construct a graph $\mathcal{G}' = (V', E')$ in the following manner. $V'$ has a group $V'_{i,j}$ of $s^2$ vertices for every pair $A_i$ and $B_j$; thus there are $n^2$ such groups. There is a vertex $u_v$ in $V'_{i,j}$ for every pair of vertices $u \in A_i$ and $v \in B_j$. The weight $w(u_v)$ of such a vertex $u_v$ is set as follows: if $\{u, v\} \in E$ then $w(u_v) = 1$ else $w(u_v) = 0$. The edges of $\mathcal{G}'$ are defined as follows. For two vertices $u_v$ and $u'_{v'}$, the edge $\{u_v, u'_{v'}\}$ exists in $\mathcal{G}'$ if and only if exactly one of the following two conditions (*) and (**) is satisfied:

(*) $\sigma(u) = \sigma(u')$ and $u \ne u'$;

(**) $\pi(v) = \pi(v')$ and $v \ne v'$.

The goal of Group-Compatibility is to select at most one vertex from every group $V'_{i,j}$ such that no two selected vertices are adjacent and the size of the solution, namely the sum of weights of selected vertices, is maximized. One could translate a lower bound for Maxrep to a lower bound for Group-Compatibility preserving inapproximability in the following manner.
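The construction of the Group-Compatibility instance from a Maxrep instance is mechanical, and a small sketch may help in reading the definition. The code below follows the text literally, including the "exactly one of (*) and (**)" rule for edges; the function name, the use of label pairs `(u, v)` for the vertex $u_v$, and the dictionary-based output are illustrative assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, List, Sequence, Set, Tuple

Vertex = Tuple[Hashable, Hashable]  # the pair (u, v) stands for the vertex u_v


def build_group_compatibility(
    a_parts: Sequence[Sequence[Hashable]],
    b_parts: Sequence[Sequence[Hashable]],
    edges: Set[Tuple[Hashable, Hashable]],
) -> Tuple[Dict[Tuple[int, int], List[Vertex]], Dict[Vertex, int], Set[FrozenSet[Vertex]]]:
    """Return (groups, weight, conflicts) for the instance derived from Maxrep.

    a_parts[i] lists the vertices of A_i, b_parts[j] those of B_j, and edges
    contains pairs (u, v) with u in A and v in B.
    """
    sigma = {u: i for i, part in enumerate(a_parts) for u in part}
    pi = {v: j for j, part in enumerate(b_parts) for v in part}
    groups = {
        (i, j): [(u, v) for u in a_i for v in b_j]
        for i, a_i in enumerate(a_parts)
        for j, b_j in enumerate(b_parts)
    }
    all_vertices = [uv for members in groups.values() for uv in members]
    weight = {(u, v): 1 if (u, v) in edges else 0 for (u, v) in all_vertices}
    conflicts: Set[FrozenSet[Vertex]] = set()
    for (u, v), (u2, v2) in combinations(all_vertices, 2):
        star = sigma[u] == sigma[u2] and u != u2        # condition (*)
        double_star = pi[v] == pi[v2] and v != v2       # condition (**)
        if star != double_star:                         # exactly one condition holds
            conflicts.add(frozenset(((u, v), (u2, v2))))
    return groups, weight, conflicts
```

For a toy instance with $n = 1$, $s = 2$ and a single edge between `a1` and `b1`, the construction yields one group of four vertices, exactly one of which has weight 1.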

Lemma 11. Maxrep has an optimal solution of size $y$ if and only if Group-Compatibility has an optimal solution of size $y$.

Proof. Consider a solution of Maxrep of size $y$. We select the vertex $u_v$ for the solution of Group-Compatibility for every pair of vertices $u \in A$ and $v \in B$ selected by Maxrep. That this is indeed a valid solution for Group-Compatibility can be seen as follows. Since Maxrep selects exactly one vertex from every $A_i$ and every $B_j$, for any two selected vertices $u_v$ and $u'_{v'}$, if $\sigma(u) = \sigma(u')$ then $u = u'$ and if $\pi(v) = \pi(v')$ then $v = v'$; moreover, if Maxrep selects $u \in A_i$ and $v \in B_j$ then only the vertex $u_v$ is selected from the group $V'_{i,j}$. For each pair $u \in A_i$ and $v \in B_j$ selected by Maxrep, the pair contributes 1 to the size of Maxrep if and only if $\{u, v\} \in E$, which holds if and only if $w(u_v) = 1$. Thus, the size of the Maxrep solution is identical to that of Group-Compatibility.

For the other direction, consider a solution of size $y$ of Group-Compatibility. For every vertex $u_v$ in the solution of Group-Compatibility, we select the vertices $u \in A$ and $v \in B$ in the solution of Maxrep. Now, we note the following. If Group-Compatibility selects two vertices $u_v$ and $u'_{v'}$ then either

• $\sigma(u) \ne \sigma(u')$ and $\pi(v) \ne \pi(v')$; or

• $\sigma(u) = \sigma(u')$ and $u = u'$; or

• $\pi(v) = \pi(v')$ and $v = v'$.

In all cases, at most one vertex from each partition of $A$ or $B$ is selected.

Now, suppose that the above solution of Maxrep did not select a vertex in some partition of $A \cup B$, say $A_i$ (the case of $B_j$ is similar). This means the solution of Group-Compatibility did not include a vertex $u_v$ for any $u \in A_i$ and any $v \in B$. Pick any $v \in B$ such that $x_v$ is in the solution of Group-Compatibility for some $x \in A$ (if Group-Compatibility does not contain any such $v$, pick a $v \in B$ arbitrarily). Pick any $u \in A_i$ and suppose that we add this vertex $u_v$ to the solution of Group-Compatibility. It is easy to see that $u_v$ is not adjacent to any other vertex in the solution of Group-Compatibility, thereby increasing the size of the solution of Group-Compatibility by 1 and contradicting the optimality of the solution of Group-Compatibility.

Finally, if $w(u_v) = 1$ for a vertex $u_v$ in the solution of Group-Compatibility then both vertices $u$ and $v$ are added to the solution of Maxrep and, since $\{u, v\} \in E$, the pair of vertices $(u, v)$ witnesses one more super-edge. Thus, the size of the solution of Maxrep is precisely $y$. □

For any constant $0 < \delta < 1$, let $0 < \varepsilon < 1$ be the corresponding constant such that $\left(\log_2(n^2 s^2)\right)^{1-\varepsilon} \le \left(\log_2(ns)\right)^{1-\delta}$. Combining Lemma 11 and Theorem 10, the following lemma follows.

Lemma 12. Let $L \in$ NP and $0 < \varepsilon < 1$ be any fixed constant. Then, there exists a reduction running in quasi-polynomial time, namely in time $n^{\text{poly}(\log n)}$, that given an instance $I$ of $L$ produces an instance $I'$ of Group-Compatibility with $n$ groups of vertices, each group having $s$ vertices, such that, for some $h$:

• if $I \in L$ then $I'$ has a solution of size $h$;

• if $I \notin L$ then $I'$ has every solution of size at most $h / 2^{\log^{1-\varepsilon}(ns)}$.

Now, we translate the instance $I'$ of Group-Compatibility into an instance $I''$ of Multi-Ssbo
using the quadratic programming reformulation (Q2). Let $G = (V, E)$ be the instance $I'$ of
Group-Compatibility with a given partition of $V$ into $n$ groups of vertices $V_1, V_2, \ldots, V_n$ where
$|V_i| = s$. Let $u_{i,1}, u_{i,2}, \ldots, u_{i,s}$ be an arbitrary ordering of the $s$ vertices of $V_i$ and let
$v_1, v_2, \ldots, v_{ns}$ denote the ordering

$u_{1,1}, u_{1,2}, \ldots, u_{1,s}, u_{2,1}, u_{2,2}, \ldots, u_{2,s}, \ldots, u_{n,1}, u_{n,2}, \ldots, u_{n,s}$

of vertices. We construct an instance $I''$ of Multi-Ssbo in the following manner. For every $V_i$,
there is a keyword $\mathcal{K}_i$ with $s$ slots with $c_{i,j} = (ns)^{6(i-1)s+6j}$. Intuitively, we associate the $j$th slot
of $\mathcal{K}_i$ with the vertex $u_{i,j}$, and scenarios 1 through $m = ns$ correspond to an enumeration of the vertices
of $G$ in the order $v_{ns}, v_{ns-1}, \ldots, v_1$. The entries $y_{i,j,k}$ of the profit matrix $Y$ and the quantities
$B_1, B_2, \ldots, B_{ns}$ in (Q2) are specified as follows. Let $ns - i + 1 = (j'-1)s + k'$ for some integers
$1 \le j' \le n$ and $1 \le k' \le s$. Then,

$$
y_{i,j,k} =
\begin{cases}
0 & \text{if } j < j' \\
0 & \text{if } j = j' \text{ and } k < k' \\
w(u_{j,k}) & \text{if } j = j' \text{ and } k = k' \\
1 & \text{if } j = j' \text{ and } k > k' \\
1 & \text{if } j > j' \text{ and } \{u_{j',k'}, u_{j,k}\} \in E \\
0 & \text{if } j > j' \text{ and } \{u_{j',k'}, u_{j,k}\} \notin E
\end{cases}
$$

and $B_i = c_{j',k'} = (ns)^{6(j'-1)s+6k'}$. It can be shown that the reduction produces the desired
inapproximability by proving the following lemma and noting that $(ns)^{-1} + (ns)^{-4} < 1$.

Lemma 13. The following two statements are true.

(a) If Group-Compatibility has a solution of size $h$, then Multi-Ssbo has a solution of size at
least $h$.

(b) If Group-Compatibility has no solution of size more than $h / 2^{\log^{1-\varepsilon}(ns)}$, then Multi-Ssbo
has no solution of size more than $\frac{h}{2^{\log^{1-\varepsilon}(ns)}} + \frac{1}{ns} + \frac{1}{(ns)^4}$.

Proof. 

(a) Consider a solution of $I'$ of size $h$ and renumber the indices, without loss of generality, such
that if a vertex is picked from the partition $V_j$ it is vertex $u_{j,1}$. Let $S$ be the set of vertices selected
in the solution of $I'$. Now, for every $j$ such that the vertex $u_{j,1} \in S$, we set $x_{j,1} = \alpha_{ns-(j-1)s} = 1$.
We first calculate the total payoff, i.e., the objective value of (Q2), obtained by this solution.

Consider the quantity $\mathrm{payoff}_i = \alpha_i \left( \sum_{j=1}^{n} \sum_{k=1}^{s} x_{j,k}\, y_{i,j,k} \right)$. If $i \neq ns - (j'-1)s$ for every
$1 \le j' \le n$, then $\alpha_i = 0$ and thus $\mathrm{payoff}_i = 0$. Now suppose that $i = ns - (j'-1)s$; thus
$ns - i + 1 = (j'-1)s + 1$. Since $S$ is a solution of Group-Compatibility, if $u_{j,1}, u_{j',1} \in S$ then
$\{u_{j,1}, u_{j',1}\} \notin E$. Thus, $\mathrm{payoff}_i = \alpha_i\, x_{j',1}\, y_{i,j',1} = w(u_{j',1})$. Consequently,
$\mathrm{payoff} = \sum_i \mathrm{payoff}_i = \sum_{u_{j',1} \in S} w(u_{j',1}) = h$.

Next, we show that all the constraints of (Q2) are satisfied. The constraint $\sum_{k=1}^{s} x_{j,k} \le 1$ is
obviously satisfied for every $j$. So, we consider, for an arbitrary $i$, the constraint

$$\alpha_i \left( \sum_{j=1}^{n} \sum_{k=1}^{s} y_{i,j,k}\, c_{j,k}\, x_{j,k} \right) \le B_i .$$

If $i \neq ns - (j'-1)s$ for every $1 \le j' \le n$, then $\alpha_i = 0$ and the constraint is obviously satisfied. So,
assume that $i = ns - (j'-1)s$ for some $1 \le j' \le n$. Then,

$$\alpha_i \left( \sum_{j=1}^{n} \sum_{k=1}^{s} y_{i,j,k}\, c_{j,k}\, x_{j,k} \right) = w(u_{j',1})\, c_{j',1} \le c_{j',1} = B_i .$$



(b) We need to show that a "small-size" solution of $I'$ must imply a "similar small-size" solution
of $I''$. Since a solution of $I''$ allows fractional values of the $x_{j,k}$'s, we first need a few normalization
procedures which change the objective value of any solution of (Q2) for the instance $I''$ only by
an $o(1)$ additive term.

Lemma 14. Consider any $1 \le i \le ns$ and let $ns - i + 1 = (j'-1)s + k'$ for some $1 \le j' \le n$ and
$1 \le k' \le s$. Then,

$$
\alpha_i\, y_{i,j,k}\, x_{j,k} \le
\begin{cases}
w(u_{j,k}) & \text{if } j = j' \text{ and } k = k' \\
(ns)^{-6} & \text{otherwise}
\end{cases}
$$

Proof. Suppose that $j = j'$ and $k = k'$; thus $y_{i,j,k} = w(u_{j,k})$. If $w(u_{j,k}) = 0$ then
$\alpha_i\, y_{i,j,k}\, x_{j,k} = 0 = w(u_{j,k})$. If $w(u_{j,k}) = 1$ then, because of the constraint
$\alpha_i \left( \sum_{j=1}^{n} \sum_{k=1}^{s} y_{i,j,k}\, c_{j,k}\, x_{j,k} \right) \le B_i$, we must have $\alpha_i\, y_{i,j,k}\, c_{j,k}\, x_{j,k} \le B_i$. Thus,
$\alpha_i\, y_{i,j,k}\, x_{j,k} \le \frac{B_i}{c_{j,k}} = 1 = w(u_{j,k})$.

If $j < j'$, or if $j = j'$ and $k < k'$, or if $j > j'$ and $\{u_{j',k'}, u_{j,k}\} \notin E$, then $y_{i,j,k} = 0 < (ns)^{-6}$.
If $j > j'$ and $\{u_{j,k}, u_{j',k'}\} \in E$ then, because of the constraint
$\alpha_i \left( \sum_{j=1}^{n} \sum_{k=1}^{s} y_{i,j,k}\, c_{j,k}\, x_{j,k} \right) \le B_i$, we must have $\alpha_i\, y_{i,j,k}\, c_{j,k}\, x_{j,k} \le B_i$. Thus,

$$\alpha_i\, y_{i,j,k}\, x_{j,k} \le \frac{B_i}{c_{j,k}} = \frac{c_{j',k'}}{c_{j,k}} = (ns)^{6(j'-1)s + 6k' - 6(j-1)s - 6k} = (ns)^{6(j'-j)s + 6(k'-k)} \le \frac{1}{(ns)^6}$$

□

Let $\Delta = \sum_{ns-i+1 \neq (j-1)s+k} \alpha_i\, x_{j,k}\, y_{i,j,k}$ and $\Gamma = \sum_{ns-i+1 = (j-1)s+k} \alpha_i\, x_{j,k}\, y_{i,j,k}$; thus

$$\mathrm{payoff} = \sum_{i=1}^{m} \mathrm{payoff}_i = \sum_{i=1}^{m} \alpha_i \left( \sum_{j=1}^{n} \sum_{k=1}^{s} y_{i,j,k}\, x_{j,k} \right) = \Gamma + \Delta .$$

We first show that $\Delta = o(1)$.

Lemma 15. $\Delta \le (ns)^{-4}$.

Proof. Since $1 \le i \le ns$, $1 \le j \le n$ and $1 \le k \le s$, using Lemma 14 we get $\Delta \le (ns)^2/(ns)^6 = (ns)^{-4}$. □
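The counting behind Lemma 15 can be spelled out in one line. The sketch below assumes only the bound of Lemma 14 and the index ranges stated in the proof, with $\Delta$ denoting the sum of $\alpha_i x_{j,k} y_{i,j,k}$ over all triples with $ns - i + 1 \neq (j-1)s + k$; the set name $T$ is introduced here for illustration only.

```latex
\begin{align*}
T &= \{(i,j,k) : 1 \le i \le ns,\ 1 \le j \le n,\ 1 \le k \le s,\ ns-i+1 \neq (j-1)s+k\},
  \qquad |T| \le (ns)\cdot n \cdot s = (ns)^2, \\
\Delta &= \sum_{(i,j,k) \in T} \alpha_i\, x_{j,k}\, y_{i,j,k}
  \;\le\; |T| \cdot (ns)^{-6}
  \;\le\; (ns)^{2} (ns)^{-6} = (ns)^{-4}.
\end{align*}
```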

Thus, we now concentrate on the quantity $\Gamma$.

Lemma 16 (Normalization lemma). Given a solution of $I''$ of Multi-Ssbo, it is possible to alter
the solution in polynomial time such that if $\Gamma'$ is the new value of $\Gamma$ after the alteration then:

(normalization within each slot) for each $1 \le j \le n$, $x_{j,k} \in \{0, 1\}$ and $\sum_{k=1}^{s} x_{j,k} \le 1$;

(compatibility between slots) if $x_{j,k} = x_{j',k'} = 1$ then $\{v_{j,k}, v_{j',k'}\} \notin E$;

(small loss in payoff) $\Gamma' \ge \Gamma - (ns)^{-1}$.

Proof. In the sequel, by a triplet of indices $i$, $j$ and $k$ we mean a triplet such that $ns - i + 1 =
(j-1)s + k$. If $\alpha_i = 0$ or $y_{i,j,k} = w(u_{j,k}) = 0$ then we can set $x_{j,k} = 0$ without changing the value
of $\Gamma$, so we assume that this is not the case.

Let $j$ be the largest index such that $0 < x_{j,k^*} < 1$ for some $k^*$. Among all such $k^*$'s let $k$ be the largest
index such that $0 < x_{j,k} < 1$. If there is no such pair of indices, our normalization procedure ends.
Otherwise, by Lemma 14, $\alpha_i\, y_{i,j,k}\, x_{j,k} = \alpha_i\, x_{j,k} \le 1$. We have the following cases.

• If $x_{j,k} < (ns)^{-3}$, we set $x_{j,k} = 0$. The resulting change in $\Gamma$ is at most $(ns)^{-3}$.

• Otherwise $x_{j,k} \ge (ns)^{-3}$. By Lemma 14, we have:

(a) For $k' < k$ and $i'$ such that $ns - i' + 1 = (j-1)s + k'$, we have $\alpha_{i'}\, y_{i',j,k}\, x_{j,k} = \alpha_{i'}\, x_{j,k} \le
(ns)^{-6}$. Since $x_{j,k} \ge (ns)^{-3}$, this implies $\alpha_{i'} \le (ns)^{-6} / x_{j,k} \le (ns)^{-3}$. Thus, if we set $x_{j,k'} = 0$,
the change in $\Gamma$ due to this is at most $\alpha_{i'}\, y_{i',j,k'}\, x_{j,k'} \le \alpha_{i'} \le (ns)^{-3}$.

For each $k' < k$ we set $x_{j,k'} = 0$. The net change in $\Gamma$ is at most $(ns)^{-2}$. Notice that
after this change $x_{j,k} = 1$ and $x_{j,k'} = 0$ for every $k' \neq k$.

(b) For $j' < j$ and $k'$ such that $\{u_{j,k}, u_{j',k'}\} \in E$, let $i'$ be such that $ns - i' + 1 = (j'-1)s + k'$.
Then, we have $\alpha_{i'}\, y_{i',j,k}\, x_{j,k} = \alpha_{i'}\, x_{j,k} \le (ns)^{-6}$. Since $x_{j,k} \ge (ns)^{-3}$, this implies
$\alpha_{i'} \le (ns)^{-6} / x_{j,k} \le (ns)^{-3}$. Thus, if we set $x_{j',k'} = 0$, the change in $\Gamma$ due to this is at most
$\alpha_{i'}\, y_{i',j',k'}\, x_{j',k'} \le \alpha_{i'} \le (ns)^{-3}$.

For each such $j'$ and $k'$ as described above we set $x_{j',k'} = 0$. The net change in $\Gamma$ is at
most $(ns)^{-2}$. Notice that after this change if $x_{j',k'} > 0$ then $\{u_{j,k}, u_{j',k'}\} \notin E$.

We repeat the normalization procedure until it cannot be applied anymore. Note that both
Step (a) and Step (b) above execute at most $n$ times. Thus, the total change in $\Gamma$ is at most
$(ns)^{-1}$. To complete the proof, we select vertices $v_{j,k}$ in our solution if $x_{j,k} = 1$. □

6.2.2 Proof of Theorem 9 (b)

We reduce the MAX-2SAT-5 problem¹⁸ to our problem. MAX-2SAT-5 is defined as follows. We are
given a collection of $m$ clauses $C_1, C_2, \ldots, C_m$ over $n$ Boolean variables $z_1, z_2, \ldots, z_n$, where every
clause is a disjunction of exactly two literals and every variable occurs exactly 5 times (and, thus,
$m = 5n/2$). The goal is to find an assignment of truth values to variables to satisfy a maximum
number of clauses. This problem was shown to be MAX-SNP-hard in [10].

Given an instance of MAX-2SAT-5 we create an instance of Int-Multi-Ssbo (i.e., (Q2)) with
$s = 2$ as follows. Every variable $z_j$ corresponds to a keyword $\mathcal{K}_j$ with two slots. The variables $x_{j,1}$
and $x_{j,2}$ encode the truth assignments of the variable $z_j$, with $x_{j,1} = 1$ indicating that $z_j$ is true and
$x_{j,2} = 1$ indicating that $z_j$ is false; we will say that $x_{j,1}$ and $x_{j,2}$ are the slots corresponding to the
literals $z_j$ and $\neg z_j$, respectively. There are exactly $m$ scenarios, each with probability $\frac{1}{m}$, defined
in the following manner:

• $B_i = 1$ for $1 \le i \le m$.

• $c_{j,k} = 1$ for $1 \le j \le n$ and $1 \le k \le 2 = s$.

• For the $i$th clause $C_i$ containing two literals, we have the $i$th scenario of the following form.
Let $x_{j,k}$ and $x_{j',k'}$ be the slots corresponding to the two literals of the clause. Then we set
$y_{i,j,k} = y_{i,j',k'} = 1$, and we set every other entry of the $i$th scenario to 0. For example, if
$C_i = z_2 \vee (\neg z_3)$ then $y_{i,2,1} = y_{i,3,2} = 1$ and $y_{i,j,k} = 0$ for all other $j$ and $k$.

An inspection of the construction reveals that it satisfies the following:

• Because this is an instance of Int-Multi-Ssbo, by Equation (2), for every $1 \le j \le n$, either
$x_{j,1} = 1$ or $x_{j,2} = 1$ but not both. On the other hand, it is always possible to set at least one
of the two variables $x_{j,1}$ or $x_{j,2}$ to 1 without decreasing the total payoff. Thus, setting
these variables corresponds to a truth assignment.

• A scenario contributes a payoff of 1 if and only if at least one of its two slots has been selected.
Thus, the contribution of a scenario corresponds to satisfying a clause.

By the above observations, we satisfy $m'$ clauses if and only if the above instance of
Int-Multi-Ssbo has a total payoff of $m'$.
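The correspondence between satisfied clauses and payoff can be checked on small formulas. The following is a minimal sketch, not the construction's official code: the function names are hypothetical, literals are encoded as signed integers (+j for $z_j$, -j for $\neg z_j$), the payoff is summed without the $\frac{1}{m}$ scenario probabilities, and each scenario multiplier is assumed to take the largest value in $[0,1]$ allowed by the budget constraint with $B_i = 1$ and $c_{j,k} = 1$. The occurrence bound of MAX-2SAT-5 is not enforced.

```python
from fractions import Fraction


def slot(literal):
    """Map a literal to its slot (j, k): +j is z_j (k=1), -j is NOT z_j (k=2)."""
    return (abs(literal), 1 if literal > 0 else 2)


def build_scenarios(clauses):
    """One scenario per clause; the two literal slots of the clause get profit 1."""
    return [{slot(a): 1, slot(b): 1} for a, b in clauses]


def selection(assignment):
    """Select slot 1 of keyword j if z_j is true, and slot 2 otherwise."""
    return {(j, 1 if value else 2) for j, value in assignment.items()}


def total_payoff(scenarios, selected):
    """Sum over scenarios of alpha_i times the selected profit.

    Assumption: alpha_i is the largest value in [0, 1] with alpha_i * load <= 1.
    """
    total = Fraction(0)
    for row in scenarios:
        load = sum(profit for s, profit in row.items() if s in selected)
        alpha = Fraction(1) if load <= 1 else Fraction(1, load)
        total += alpha * load
    return total


def satisfied(clauses, assignment):
    """Number of clauses with at least one true literal."""
    def is_true(literal):
        return assignment[abs(literal)] == (literal > 0)
    return sum(1 for a, b in clauses if is_true(a) or is_true(b))
```

Under these assumptions, a scenario whose two slots are both selected has load 2 and multiplier 1/2, so it still contributes exactly 1, which matches the claim that a scenario contributes 1 if and only if its clause is satisfied.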

7 Other Results 

7.1 Improved Algorithms for Special Cases of Ssbo and Multi-Ssbo 

By the phrase "an absolute error of $\delta$" in Lemma 17 we mean that if our solution
returns an objective value of $x$ when the optimal value is $y$ then $|x - y| \le \delta$.

Lemma 17.

(a) (Fixed number of scenarios) If $m$ is fixed, Frac-Multi-Ssbo admits a pseudo-polynomial
time solution with an absolute error of $\delta$ for any fixed $\delta > 0$, Int-Ssbo admits a
pseudo-polynomial time $O(1)$-approximation and Int-Multi-Ssbo admits a pseudo-polynomial

time 0(log 2 n)- approximation. 

¹⁸ Our reduction approach should also work if we start with MAX-2SAT-k for any constant k.

(b) (Fixed number of keywords) If $ns$ is fixed, then Frac-Multi-Ssbo admits a pseudo-polynomial
time solution with an absolute error of $\delta$ for any fixed $\delta > 0$.

(c) (Logarithmic number of keywords) If $ns = O(\log m)$ then Int-Multi-Ssbo admits a
polynomial time exact solution.

(d) (Fixed number of scenarios and polynomial bids) If $m$ is fixed and the maximum size
of all the numbers, namely $\max \left\{ \max_{i,j,k} \{ y_{i,j,k} \},\ \max_{i} \{ B_i \},\ \max_{i} \left\{ \frac{1}{\varepsilon_i} \right\},\ \max_{i,j,k} \{ c_{i,j,k} \} \right\}$, is at most
$\mathrm{poly}(n)$ then Int-Multi-Ssbo admits a polynomial time solution with an absolute error of $\delta$
for any fixed $\delta > 0$.

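Proof.

(a) and (b) We prove part (a) as follows (the proof for part (b) is similar). Consider the
Frac-Multi-Ssbo problem; let $y = \max_{1 \le i \le m,\ 1 \le j \le n,\ 1 \le k \le s} \{ y_{i,j,k} \}$.

Proposition 4. Let $\alpha^* = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_m^*)$ and $x^* = (x_{1,1}^*, \ldots, x_{1,s}^*, x_{2,1}^*, \ldots, x_{2,s}^*, \ldots, x_{n,1}^*, \ldots, x_{n,s}^*)$
be the solution vectors for an optimal solution of value $E[\mathrm{payoff}^*] = \sum_{i=1}^{m} \alpha_i^* \left( \sum_{j=1}^{n} \sum_{k=1}^{s} y_{i,j,k}\, x_{j,k}^* \right)$.
Suppose that we approximate the vector $\alpha^*$ by a vector $\alpha_\varepsilon = (\alpha_{1,\varepsilon}, \ldots, \alpha_{m,\varepsilon})$ such that $|\alpha_i^* - \alpha_{i,\varepsilon}| \le \varepsilon$
for each $i$. Then, if $\varepsilon \le \frac{\delta}{nsy}$, we can compute a solution with a total expected payoff of at least
$E[\mathrm{payoff}^*] - \delta$.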

Proof. Our algorithm is simple. Plugging the values of this $\alpha_\varepsilon$ in (Q2) reduces it to a linear
program, which can be solved optimally in polynomial time giving a solution vector, say $x_\varepsilon$. Our
solution vectors are $\alpha_\varepsilon$ and $x_\varepsilon$. Obviously, all the constraints are satisfied, so we just need
to check the total expected payoff of our solution. For notational convenience, let
$F(\alpha, x) = \sum_{i=1}^{m} \alpha_i \left( \sum_{j=1}^{n} \sum_{k=1}^{s} y_{i,j,k}\, x_{j,k} \right)$ for two vectors
$x = (x_{1,1}, \ldots, x_{1,s}, x_{2,1}, \ldots, x_{2,s}, \ldots, x_{n,1}, \ldots, x_{n,s})$
and $\alpha = (\alpha_1, \ldots, \alpha_m)$; thus $F(\alpha^*, x^*) = E[\mathrm{payoff}^*]$. Then,

$$\left| F(\alpha^*, x^*) - F(\alpha_\varepsilon, x^*) \right| \le \varepsilon \sum_{j=1}^{n} \sum_{k=1}^{s} y_{i,j,k} \le \varepsilon\, n s y$$

$$\Longrightarrow\ F(\alpha_\varepsilon, x_\varepsilon) \ge F(\alpha_\varepsilon, x^*) \ge F(\alpha^*, x^*) - \varepsilon\, n s y \ge F(\alpha^*, x^*) - \delta$$

□

To get such an $\alpha_\varepsilon$, for every $\alpha_{i,\varepsilon}$ we try out all rational numbers between 0 and 1 of the form
$\frac{j\delta}{2nsy}$ for $j = 0, 1, \ldots, \frac{2nsy}{\delta}$ until we succeed. The total number of choices is at most $\left( \frac{2nsy}{\delta} + 1 \right)^m$,
which is pseudo-polynomial¹⁹ in the size of the input since $m$ is fixed.

The result for Int-Multi-Ssbo follows by using the above proof with Lemma 4.

¹⁹ The running time is not strongly polynomial since the input size depends polynomially on $\log_2 y$ (see Section 2.5).

(c) When $ns = O(\log m)$ then we can try out all possible $\mathrm{poly}(m)$ assignments of keywords to slots.
For each assignment, we can directly calculate the values of $\alpha_1, \alpha_2, \ldots, \alpha_m$. We take the best of all
such solutions.

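The exhaustive search in part (c) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the function names are hypothetical, indices are 0-based, costs are taken as `c[j][k]` without a scenario index, scenario probabilities are folded into the profits, and for a fixed choice of slots each scenario multiplier is set to the largest value in $[0,1]$ that respects its budget.

```python
from fractions import Fraction
from itertools import product


def scenario_payoff(y_i, c, budget, choice):
    """Payoff of one scenario for a fixed choice of at most one slot per keyword."""
    picked = [(j, k) for j, k in enumerate(choice) if k is not None]
    gain = sum(y_i[j][k] for j, k in picked)
    load = sum(y_i[j][k] * c[j][k] for j, k in picked)
    # Assumption: alpha is the largest value in [0, 1] with alpha * load <= budget.
    alpha = Fraction(1) if load <= budget else Fraction(budget) / Fraction(load)
    return alpha * gain


def best_assignment(y, c, B):
    """Try all (s + 1)^n slot choices; return (best value, best choice).

    y[i][j][k]: profit of slot k of keyword j in scenario i
    c[j][k]:    cost of slot k of keyword j
    B[i]:       budget of scenario i (assumed non-negative)
    """
    n, s = len(c), len(c[0])
    best_value, best_choice = Fraction(-1), None
    for choice in product([None, *range(s)], repeat=n):
        value = sum(
            (scenario_payoff(y[i], c, B[i], choice) for i in range(len(y))),
            Fraction(0),
        )
        if value > best_value:
            best_value, best_choice = value, choice
    return best_value, best_choice
```

The loop visits $(s+1)^n \le 2^{ns}$ choices, which is polynomial in $m$ when $ns = O(\log m)$; exact rational arithmetic is used only to keep the comparison of candidate values free of rounding issues.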


(d) Let $p_1(n)$ be a polynomial in $n$ such that $\max \left\{ y,\ \max_{i} \{ B_i \},\ \max_{i,j,k} \{ w_{i,j,k} \} \right\} \le p_1(n)$. By the
proof in part (a), to ensure an absolute error of $\delta$, it suffices to try all vectors $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ in
which each $\alpha_i$ is a non-negative rational number with numerator and denominator at most $p_2(n)$ for
some polynomial $p_2(n)$, and provide a solution of Int-Multi-Ssbo for this $\alpha$ in polynomial time.
We will refer to $B_i$ as the "expected budget" for the $i$th scenario. Let $E[\mathrm{payoff}(j, k, b_1, \ldots, b_m)]$
be the optimal value of the expected payoff when no slot was selected after the $k$th slot of the $j$th
keyword and the expected budget for the $i$th scenario was $b_i$. It is easy to see that the following
recurrence holds:

$$
E[\mathrm{payoff}(j, k, b_1, \ldots, b_m)] = \max \left\{
\begin{array}{l}
\sum_{i=1}^{m} \alpha_i\, y_{i,j,k} + E[\mathrm{payoff}(j-1, s, b_1 - \alpha_1 w_{1,j,k}, \ldots, b_m - \alpha_m w_{m,j,k})], \\
E[\mathrm{payoff}(j, k-1, b_1, \ldots, b_m)]
\end{array}
\right\}
$$

Based on the above recurrence, it is easy to design a polynomial time dynamic programming
algorithm to compute the optimal solution $E[\mathrm{payoff}(n, s, B_1, \ldots, B_m)]$ of Int-Multi-Ssbo. □
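A memoized version of this recurrence can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the function name is hypothetical, the multipliers $\alpha_i$ are fixed in advance, the gain of selecting slot $k$ of keyword $j$ is taken to be $\sum_i \alpha_i y_{i,j,k}$, the quantities $w_{i,j,k}$ are supplied as input, a slot may be selected only if no remaining budget becomes negative, and base cases return 0. It illustrates the recurrence only; it does not by itself bound the number of distinct budget states.

```python
from fractions import Fraction
from functools import lru_cache


def dp_payoff(alpha, y, w, B):
    """Best payoff for fixed alpha, choosing at most one slot per keyword.

    alpha[i]:   fixed multiplier of scenario i (Fraction)
    y[i][j][k]: profit of slot k of keyword j in scenario i
    w[i][j][k]: budget consumed in scenario i by that slot (assumed input)
    B[i]:       expected budget of scenario i
    """
    m, n, s = len(alpha), len(y[0]), len(y[0][0])

    @lru_cache(maxsize=None)
    def f(j, k, b):
        # State: keywords 1..j remain, slots 1..k of keyword j are still allowed.
        if j == 0:
            return Fraction(0)
        if k == 0:
            return f(j - 1, s, b)  # keyword j gets no slot
        skip = f(j, k - 1, b)
        rest = tuple(b[i] - alpha[i] * w[i][j - 1][k - 1] for i in range(m))
        if min(rest) < 0:
            return skip  # selecting this slot would exceed some budget
        gain = sum((alpha[i] * y[i][j - 1][k - 1] for i in range(m)), Fraction(0))
        return max(skip, gain + f(j - 1, s, rest))

    return f(n, s, tuple(Fraction(v) for v in B))
```

Taking a slot moves directly to keyword $j-1$ with all $s$ slots available, which mirrors the first branch of the recurrence and enforces at most one slot per keyword.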

7.2 Limitations of the Semidefinite Programming Relaxation Approaches for Ssbo

A natural semidefinite programming (SDP) relaxation approach to solve quadratic programs such as
(Q1), extensively used in the existing literature for efficient approximations of quadratic programs
for MAX-CUT, MAX-2SAT and many other problems [24], is as follows. We first add some redundant
inequalities to (Q1): for every $i$ and $j$ we add the inequality $\alpha_i x_j \ge 0$. Clearly,
this does not change the solutions of (Q1). Then, (Q1) can be relaxed to a vector
program (V) by replacing the variables by $(m + n)$-dimensional vectors and the product of variables
by the inner product (denoted by .) of the corresponding vectors. The resulting
vector program is shown in Fig. 6; it is well known that (V) is a relaxation of (Q1) (e.g., see [24]).

Since the lower bounds in Theorem 5 have $\varepsilon < 1$ and thus leave a "very small" gap between
this lower bound and the upper bound in Theorem 1, one might wonder if the gap can be somewhat
narrowed down by designing an approximation algorithm based on the SDP-relaxation approaches
whose approximation ratio is, say, o (^j^^j or o ^ fcpofg ) ^ However, we show that the large 
integrality gap of the SDP-relaxation does not allow for such a possibility.

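Lemma 18 (Limitations of SDP-relaxation approaches). Let $k = 1$. Let $\mathrm{OPT}_{Q1}$ and $\mathrm{OPT}_{V}$ be the
total optimal payoff for an instance of (Q1) and the optimal value of the objective function of (V),
respectively. Then, $\frac{\mathrm{OPT}_{V}}{\mathrm{OPT}_{Q1}} \ge \frac{m}{2} = \Omega\left( \frac{\log d_n}{\log \log d_n} \right)$.

(* Vector program (V) *)

$$
\begin{array}{lll}
\text{maximize} & \sum_{i=1}^{m} \sum_{j=1}^{n} y_{i,j}\; U_i \cdot V_j & \\
\text{subject to} & \forall\, 1 \le i \le m: & \sum_{j=1}^{n} c_{i,j}\, y_{i,j}\; U_i \cdot V_j \le B_i \\
& \forall\, 1 \le i \le m\ \ \forall\, 1 \le j \le n: & U_i \cdot V_j \ge 0 \\
& \forall\, 1 \le i \le m: & U_i \cdot U_i \le 1 \\
& \forall\, 1 \le i \le m: & U_i \in \mathbb{R}^{m+n} \\
& \forall\, 1 \le j \le n: & V_j \cdot V_j \le 1 \\
& \forall\, 1 \le j \le n: & V_j \in \mathbb{R}^{m+n}
\end{array}
$$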

Figure 6: SDP-relaxation of (Q1).

Proof. We reuse the notations and terminologies used in the proof of Theorem 5. Let the given
graph $G$ be a completely connected graph; thus $\Delta_{\mathrm{ind}} = 1$. We construct an instance of Ssbo
as in Theorem 5. Thus, $\Delta_{Q1} \le 1 + \Delta_{\mathrm{ind}} = 2$. Note that $c_n = d_n = m^{6m}$ and thus
$m = \Theta(\log d_n / \log \log d_n)$.

However, we show that $\mathrm{OPT}_{V} \ge m$. Let $U_1, \ldots, U_m$ be a set of mutually orthogonal unit-norm
vectors in $\mathbb{R}^{m+n}$ and let $V_j = U_{m-j+1}$ for $1 \le j \le m$. Thus, $U_i \cdot V_j$ is 1 if $i + j = m + 1$
and is 0 otherwise, and $U_i \cdot U_i = V_i \cdot V_i = 1$ for all $i$. Obviously, $\sum_{i=1}^{m} \sum_{j=1}^{n} y_{i,j}\; U_i \cdot V_j = m$. We
now verify that this is indeed a valid solution of (V) by checking that it satisfies all the constraints
$\left( \sum_{j=1}^{n} w_{i,j}\; U_i \cdot V_j \right) \le B_i$ for $1 \le i \le n$. It can be seen that $\left( \sum_{j=1}^{n} w_{i,j}\; U_i \cdot V_j \right) = w_{i,m-i+1} =
c_{m-i+1} = B_i$. □
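The inner-product pattern used in this proof is easy to verify numerically. The sketch below is an assumption-laden illustration: it picks standard basis vectors as the mutually orthogonal unit vectors, uses hypothetical function names and 0-based indices, and checks only the pattern of inner products, not the profits and costs of the Theorem 5 instance.

```python
def unit(index, dim):
    """Standard basis vector of the given dimension with a 1 at position index."""
    return [1 if t == index else 0 for t in range(dim)]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def gap_vectors(m, n):
    """Return (U, V) with V_j = U_{m-j+1} (1-based), as vectors in R^(m+n)."""
    dim = m + n
    U = [unit(i, dim) for i in range(m)]
    V = [U[m - 1 - j] for j in range(m)]  # 0-based form of V_j = U_{m-j+1}
    return U, V
```

With 0-based indices, `dot(U[i], V[j])` is 1 exactly when `i + j == m - 1`, which is the 0-based form of $i + j = m + 1$.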



7.3 Combinatorial Dual of Ssbo Problems

In Dual-Ssbo, the natural combinatorial dual version of Ssbo, we are given a lower bound, say
$\mathcal{P}$, on $E[\mathrm{payoff}]$. Our goal is to compute the minimum possible value of the budget $B$ of the
advertiser such that his/her total expected payoff is at least $\mathcal{P}$. The dual version Dual-Multi-Ssbo
of Multi-Ssbo can be defined in a manner analogous to that of Dual-Ssbo. Dual-Ssbo can be
reformulated as the quadratic program (Dual-Q1) shown in Fig. 7.

(* Quadratic program (Dual-Q1) *)

$$
\begin{array}{lll}
\text{minimize} & B & \\
\text{subject to} & \sum_{i=1}^{m} \sum_{j=1}^{n} \alpha_i\, x_j\, y_{i,j} \ge \mathcal{P} & \\
& \forall\, 1 \le i \le m: & \alpha_i \left( \sum_{j=1}^{n} c_{i,j}\, y_{i,j}\, x_j \right) \le \varepsilon_i B \\
& \forall\, 1 \le i \le m: & 0 \le \alpha_i \le 1 \\
& \forall\, 1 \le j \le n: & 0 \le x_j \le 1
\end{array}
$$

Figure 7: Quadratic program for Dual-Ssbo.

Obviously, Dual-Ssbo is NP-hard since Ssbo is NP-hard. For a given required expected profit
$\mathcal{P}$, let $B_{\mathcal{P}}$ be the minimum budget that achieves the expected total profit $\mathcal{P}$. We define a bi-criteria
approximation for Dual-Ssbo in the following manner:

a $(\delta, \gamma)$-approximation for Dual-Ssbo, for $\delta, \gamma \ge 1$, is a solution that achieves an
expected total profit of at least $\frac{\mathcal{P}}{\delta}$ with a budget of $\gamma B_{\mathcal{P}}$.

Lemma 19.

(a) (Inapproximability of Dual-Ssbo via inapproximability of Ssbo)

• If Frac-Ssbo cannot be approximated to within a ratio of $\rho \ge 1$ for some parameter range,
then Dual-Frac-Ssbo also cannot be approximated to within a ratio of $\rho$ for the same
parameter range.

• If Int-Ssbo cannot be approximated to within a ratio of $\rho \ge 1$ for some parameter range,
then Dual-Int-Ssbo also cannot be approximated to within a ratio of $\frac{\rho}{200 \ln m}$ for the same
parameter range.

(b) (Bi-criterion approximation of Dual-Frac-Ssbo via Frac-Ssbo) If Frac-Ssbo can
be approximated to within a ratio of $\rho \ge 1$ for some parameter range, then Dual-Frac-Ssbo has a
$(\rho, 1)$-approximation in the same parameter range.

Proof. Let $E[\mathrm{payoff}^{B}]$ be the optimal total expected payoff for Ssbo when the budget is $B$. For
any constant $\lambda \ge 1$, a solution of (Q1) with a budget of $B$ is obviously also a solution of the
same instance of (Q1) with a budget of $\lambda B$. This implies $E[\mathrm{payoff}^{\lambda B}] \ge E[\mathrm{payoff}^{B}]$. Let
$P = \sum_{i=1}^{m} \sum_{j=1}^{n} y_{i,j}$ and $b = \max_{1 \le i \le m} \left\{ \sum_{j=1}^{n} a_{i,j}\, c_{i,j} \right\}$; note that both $\log_2 P$ and $\log_2 b$ are
polynomial in the size of the input (see Section 2.5).

We prove (a) by contradiction. Suppose that some version of Dual-Ssbo has a $\rho$-approximation.
Consider an instance of the same version of Ssbo and suppose the budget is $B$. We do a binary
search in the range of positive integers $[1, P]$ in polynomial time with the approximation algorithm
for Dual-Ssbo to find a $\mathcal{P} \in [1, P]$ such that $B_{\mathcal{P}-1} \le \rho B$ but $B_{\mathcal{P}} > \rho B$. Consider this solution of
Dual-Ssbo and suppose that $B^*$ is the actual optimal value of the budget corresponding to the
total expected payoff $\mathcal{P}$. Thus, $B^* \ge \frac{B_{\mathcal{P}}}{\rho} > B$ and $E[\mathrm{payoff}^{B_{\mathcal{P}}}] \ge E[\mathrm{payoff}^{B^*}] \ge E[\mathrm{payoff}^{B}]$.
Suppose that we now divide every $x_i$ by $\rho$. This provides a valid solution of Frac-Ssbo with a
total expected payoff of at least $\frac{E[\mathrm{payoff}^{B^*}]}{\rho}$. By Lemma 4, from this valid solution of Frac-Ssbo
one can obtain a solution of Int-Ssbo with a total expected payoff of at least $\frac{E[\mathrm{payoff}^{B^*}]}{200\, \rho \ln m}$.

To prove (b), suppose that some version of Ssbo with a budget of $B$ has a $\rho$-approximation
algorithm. Consider an instance of the same version of Dual-Ssbo with a requirement of total
expected payoff of $\mathcal{P}$ and let $B_{\mathcal{P}}$ be the value of an optimal budget for this instance. Since
$\left( 1 - \frac{1}{B+1} \right) E[\mathrm{payoff}^{B+1}] \le E[\mathrm{payoff}^{B}] \le E[\mathrm{payoff}^{B+1}]$, we do a binary search in the range
of positive integers $[1, b]$ in polynomial time with the $\rho$-approximation algorithm for Ssbo to find
a $B \in [1, b]$ such that $\frac{\mathcal{P}}{\rho} \le E[\mathrm{payoff}^{B}] \le \rho \mathcal{P} + 1$. Thus, this provides a solution of
Dual-Ssbo with a total expected payoff of at least $\frac{\mathcal{P}}{\rho}$ and a budget of at most $B_{\mathcal{P}}$, giving the desired
$(\rho, 1)$-approximation in (b). □
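The binary search over integer budgets used in this proof can be sketched generically. This is only an illustration under simplifying assumptions: `payoff_at` is a hypothetical oracle that returns a payoff value for a given integer budget and is assumed to be nondecreasing in the budget, and an exact threshold test stands in for the $\rho$-approximation guarantees used in the proof.

```python
def min_budget(payoff_at, target, b):
    """Smallest integer budget in [1, b] whose payoff reaches target, else None.

    payoff_at: callable mapping an integer budget to a payoff (hypothetical oracle,
               assumed nondecreasing in the budget)
    target:    required total expected payoff
    b:         upper end of the budget range
    """
    lo, hi, answer = 1, b, None
    while lo <= hi:
        mid = (lo + hi) // 2
        if payoff_at(mid) >= target:
            answer = mid      # feasible; try to shrink the budget
            hi = mid - 1
        else:
            lo = mid + 1      # infeasible; a larger budget is needed
    return answer
```

The search makes $O(\log b)$ oracle calls, which is polynomial in the input size whenever $\log_2 b$ is.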



8 Conclusion 

We have presented the first known approximation algorithms as well as hardness results for
stochastic budget optimization under the scenario model. The scenario model is natural in many areas,
and it is particularly apt for internet ad systems. We obtained our results by making the
connection between these problems and a special case of bipartite quadratic programs; we exploited this
intuition crucially in both approximation algorithms and hardness proofs. This class of quadratic
programs may have independent applications elsewhere.

Our work shows that there are several parameter regimes in which stochastic budget
optimization is solvable with reasonable computational resources, even with multiple slots. Our hope
is therefore that, in practice, one can carefully model particular applications such as sponsored
search so that the parameters are suitable, and that advertisers can then optimize their campaigns
more effectively than is typically done now by applying some of the algorithms in this paper.